
CN109960708A - Data processing method, device, electronic equipment and storage medium - Google Patents

Data processing method, device, electronic equipment and storage medium

Info

Publication number
CN109960708A
Authority
CN
China
Prior art keywords
data
source system
file
database
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910221391.2A
Other languages
Chinese (zh)
Inventor
王辉
姜长江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rose Wisdom Technology Co Ltd
Original Assignee
Rose Wisdom Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rose Wisdom Technology Co Ltd filed Critical Rose Wisdom Technology Co Ltd
Priority to CN201910221391.2A priority Critical patent/CN109960708A/en
Publication of CN109960708A publication Critical patent/CN109960708A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a data processing method, device, electronic equipment and storage medium, and relates to the technical field of data processing. The method comprises: extracting target data from a source system table according to the log file of the source system database, the connection type of the source system database, and preconfigured database extraction parameters, wherein the preconfigured database extraction parameters include an identifier of the source system and an identifier of the source system table; generating, from the target data, an interface data file that meets a first naming requirement; loading the data in the interface data file that meets the first naming requirement into a data warehouse; and integrating and calculating, in the data warehouse, the data in that interface data file to obtain processed data. Compared with the prior art, this avoids repetitive work in ETL job data processing, so that batch data processing runs more efficiently and the development and maintenance costs of ETL job data processing are reduced.

Description

Data processing method, device, electronic equipment and storage medium
Technical field
The present invention relates to the technical field of data processing, and in particular to a data processing method, device, electronic equipment and storage medium.
Background
ETL (Extract-Transform-Load) describes the process by which data is extracted from a source system, transformed, and loaded into a target system. It is the foundational work of data platforms and data warehouses: in the big data era, data analysis, data mining, data visualization and data modeling are all built on top of ETL.
In existing overall ETL designs, the data processing logic is written as SQL (Structured Query Language) statements outside the database or data platform, for example on an ETL server. Scripting languages such as Perl (Practical Extraction and Report Language), Python or Shell then call the database client to execute those SQL statements and produce the desired data processing result.
However, in existing ETL designs, the workload of developing, deploying and maintaining ETL jobs is directly proportional to the data volume. Especially when the number of tasks is large, the development and maintenance costs of ETL jobs cannot be kept under control.
Summary of the invention
In view of the above deficiencies of the prior art, an object of the present invention is to provide a data processing method, device, electronic equipment and storage medium, so as to solve the problem of high development and maintenance costs in existing data processing.
To achieve the above object, the technical solutions adopted by the embodiments of the present invention are as follows:
In a first aspect, an embodiment of the present invention provides a data processing method, comprising:
extracting target data from a source system table according to the log file of the source system database, the connection type of the source system database, and preconfigured database extraction parameters, wherein the preconfigured database extraction parameters include an identifier of the source system and an identifier of the source system table;
generating, from the target data, an interface data file that meets a first naming requirement;
loading the data in the interface data file that meets the first naming requirement into a data warehouse; and
integrating and calculating, in the data warehouse, the data in the interface data file that meets the first naming requirement, to obtain processed data.
Optionally, the method further comprises: taking the data warehouse as the source system, extracting target data from the data warehouse according to the log file of the data warehouse, the connection type of the data warehouse, and preconfigured database extraction parameters, wherein the preconfigured database extraction parameters include an identifier of the data warehouse and an identifier of the data warehouse table;
generating, from the target data, an interface data file that meets the first naming requirement; and
loading the data in the interface data file that meets the first naming requirement into a Data Mart.
Optionally, extracting target data from the source system table according to the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters comprises:
extracting, by multiple processes concurrently, target data from multiple source system tables according to the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters, wherein one process corresponds to one source system table.
Optionally, the preconfigured database extraction parameters further include data date information of the source system table and an extraction mode identifier, wherein the extraction mode identifier indicates incremental extraction or full extraction;
and extracting target data from the source system table according to the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters comprises:
extracting from the source system table, according to the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters, the incremental interface data file or the full interface data file corresponding to the data date information of the source system table.
Optionally, loading the data in the interface data file that meets the first naming requirement into the data warehouse comprises:
loading, according to the log file of the data warehouse and the connection type of the data warehouse, the data in the interface data file that meets the first naming requirement into an interface table in the data warehouse that meets a second naming requirement, wherein the second naming requirement for the interface table includes the preconfigured database extraction parameters.
Optionally, integrating and calculating, in the data warehouse, the data in the interface data file that meets the first naming requirement to obtain processed data comprises:
calling a job file written in a preset language according to the log file of the data warehouse, the connection type of the data warehouse, and the database name of the data warehouse; and
running the job file to integrate and calculate the data in the interface data file that meets the first naming requirement, to obtain the processed data.
In a second aspect, an embodiment of the present invention further provides a data processing device, comprising an extraction module, a generation module, a loading module and an obtaining module, wherein:
the extraction module is configured to extract target data from a source system table according to the log file of the source system database, the connection type of the source system database, and preconfigured database extraction parameters, wherein the preconfigured database extraction parameters include an identifier of the source system and an identifier of the source system table;
the generation module is configured to generate, from the target data, an interface data file that meets the first naming requirement;
the loading module is configured to load the data in the interface data file that meets the first naming requirement into a data warehouse;
the obtaining module is configured to integrate and calculate, in the data warehouse, the data in the interface data file that meets the first naming requirement, to obtain processed data.
Optionally, the extraction module is specifically configured to: extract, by multiple processes concurrently, target data from multiple source system tables according to the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters, wherein one process corresponds to one source system table.
In a third aspect, an embodiment of the present invention further provides electronic equipment, comprising a processor, a storage medium and a bus. The storage medium stores machine-readable instructions executable by the processor; when the electronic equipment runs, the processor and the storage medium communicate over the bus, and the processor executes the machine-readable instructions to perform the steps of the data processing method of the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a storage medium storing a computer program which, when run by a processor, performs the steps of the data processing method of the first aspect.
The beneficial effects of the present invention are as follows. In the data processing method provided herein, target data is extracted from source system tables in batches, as a batch data extraction job, based on the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters. Interface data files that meet the first naming requirement are generated from the target data, and the data warehouse processes the data of these interface data files in batches: according to the naming scheme of the interface data files, the files that meet the first naming requirement are loaded in batches into the interface tables of the corresponding target system that meet the second naming requirement. This avoids repetitive work in ETL job data processing, improves the efficiency of ETL job data processing when batch data processing jobs are executed, and reduces the development and maintenance costs of ETL job data processing.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a flow diagram of the data processing method provided by one embodiment of the application;
Fig. 2 is a flow diagram of the data processing method provided by another embodiment of the application;
Fig. 3 is a structural schematic diagram of the data processing device provided by one embodiment of the application;
Fig. 4 is a structural schematic diagram of the electronic equipment provided by one embodiment of the application.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention.
Before the application is described, the terms used in it are explained as follows.
ETL: short for Extract-Transform-Load, describing the process by which data is extracted from a source, transformed, and loaded to a destination. ETL is currently implemented with two technical architectures: the ETL architecture and the ELT architecture.
ETL architecture: in the ETL architecture, data flows from the source to the ETL tool. The ETL tool is an independent data processing engine, generally running on its own hardware server, that performs all data transformation work and then loads the data into the target data warehouse. To make the whole ETL process more efficient, one can only upgrade the configuration of the ETL tool server or optimize the processing flow.
ELT architecture: in the ELT architecture, the ELT tool only provides a graphical interface for designing business rules. The whole data flow runs between the target database and the source database, and the ELT tool coordinates the relevant database systems to execute the relevant applications. Data processing can be executed either at the source database or at the target data warehouse, depending mainly on the system architecture design and the data attributes.
Source system: also called the source data provider or upstream system. It lies upstream in the data flow and is the supplying side relative to the systems downstream of it.
Target system: also called the data receiver or downstream system. It lies downstream in the data flow and is the target side relative to the systems upstream of it.
Data warehouse: a proper noun that can be understood as the place dedicated to data integration and data calculation; its physical carrier is usually a distributed database. An enterprise data warehouse is where the data of all source systems is gathered: it is large and comprehensive, subject-oriented, and its data integration is not split by business line. Example: in a bank's data warehouse, the data of source systems such as the loan system, the deposit system and the access system is extracted into the data warehouse, then processed and summarized to form integrated data such as the customer subject, the account subject and the asset subject.
Data Mart: relative to the data warehouse, a Data Mart serves a specific department or a specific application; it can be pictured as a small data warehouse or a subset of the data warehouse. Example: a bank's credit risk application is provided to the risk and compliance department. It holds data processed along subjects such as customer assets and transaction events, and provides effective risk data support for credit loan financial products.
Log file: the login file that stores the user name and the encrypted password.
Interface data file: the data file formed when the source data provider, within a specified time and according to the specification, periodically extracts data from the source system. An interface data file generally needs to specify the file encoding (for example UTF8) and the field separator within the file (for example ## between fields).
Framework program: a framework is a supporting structure designed to solve some common problem, and a framework program is a program formed by abstracting common functions in some language. A framework program reduces the development of redundant programs and the number of programs, uses soft coding as far as possible, and handles most of the basic processing work with a small number of programs. The user then only needs to configure some parameters to extract, load and accumulate data, and framework programs are also easy to port across platforms.
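As a rough illustration of the interface data file definition above, the following Python sketch writes and reads a UTF-8 file whose fields are separated by ##. The function names, the sample records and the one-record-per-line layout are assumptions made for illustration; the application does not prescribe them.

```python
import tempfile
from pathlib import Path

FIELD_SEPARATOR = "##"  # example separator from the definition above
ENCODING = "utf-8"      # example encoding from the definition above


def write_interface_file(path, rows):
    """Write one record per line, joining fields with the agreed separator.

    Assumption: field values never contain the separator or a newline.
    """
    with open(path, "w", encoding=ENCODING, newline="\n") as handle:
        for row in rows:
            handle.write(FIELD_SEPARATOR.join(str(field) for field in row) + "\n")


def read_interface_file(path):
    """Read the file back into a list of field lists."""
    with open(path, "r", encoding=ENCODING, newline="\n") as handle:
        return [line.rstrip("\n").split(FIELD_SEPARATOR) for line in handle]


# Hypothetical sample records: customer id, customer name, balance.
sample_rows = [("1001", "张三", "2500.00"), ("1002", "Li Si", "80.50")]
sample_path = Path(tempfile.mkdtemp()) / "sample.dat"
write_interface_file(sample_path, sample_rows)
loaded_rows = read_interface_file(sample_path)
```

Fixing the encoding and separator in one place is what lets a single loader handle the files of every source system.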
The data processing method of the application is applicable to data processing in both the ETL architecture and the ELT architecture. The application environment is introduced first: it includes a source system, a target system, a data source, a data warehouse and a Data Mart. Relative to the data warehouse, the data source is the source system and the data warehouse is the target system; relative to the Data Mart, the data warehouse is the source system and the Data Mart is the target system.
The source system completes the extraction of data, and the target system completes the reception and loading of data. The method can be used to process business data, for example in enterprise-level data warehouses in banking, insurance, securities and telecommunications. The specific setup depends on user needs, and the application does not limit the application scenario.
Fig. 1 is a flow diagram of a data processing method provided by the application. The method is applicable to the ELT architecture and can also be used for the ETL architecture. In the method, the source system can complete the data extraction job of ETL, and the target system can complete the processing and loading of the ETL data. As shown in Fig. 1, the method includes:
S101, extract target data from the source system table according to the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters.
The preconfigured database extraction parameters include an identifier of the source system and an identifier of the source system table.
The identifier of the source system may be the abbreviation of the source system, the database type of the source system, and so on; the identifier of the source system table may include the name of the source system table. The application imposes no restriction here.
The database of the source system may be Oracle, SQL Server, MySQL, MongoDB or GBase; the application does not limit the database type of the source system. Each source system database can select a corresponding log file according to its own security settings, and the corresponding database extraction parameters can be configured for each source system database type, in preparation for the data extraction job on the source system tables. The security settings may include a security level, security requirements and so on; the application imposes no restriction here.
In one possible embodiment, the log file stores the user name and password of the source system database. The password may be encrypted, and the log file may be stored on a system other than the source system, for example on the ETL server, so that different source system databases correspond to different log files. When the source system needs to be accessed, the user name and password are obtained from the log file and decrypted with the corresponding decryption algorithm. Because the log file is kept separate from the source system database, the user name and password of the source system database are not easy to obtain, which improves the security of the source system database. The application only describes here selecting a log file according to the security settings of the source system database. A log file can likewise be selected according to the security settings of the target system database, and the log file of the target system database can be stored on a system other than the target system database, for example a dedicated ETL server, to improve the security of the target system database. This is not repeated here.
For example, when the source system is a loan system, its database has an access user name. Suppose its log file is named Source_Loan_Logon; the file records the access user name and password, where the password may be encrypted with a preset encryption algorithm. The application does not limit this.
S102, generate, from the target data, an interface data file that meets the first naming requirement.
In the application, an interface data file is the data file formed when the corresponding data extraction module, within a specified time and according to the database extraction parameters of the source system database, periodically extracts data from the source system.
In an optional embodiment, interface data files are named according to the first naming requirement, which standardizes their names. The data warehouse can then select the corresponding interface data files in batches according to a screening rule and process them, which improves the efficiency of batch processing of interface data files. The first naming requirement may be: <source system English abbreviation>.<table name>.<date (format YYYYMMDD)>.<batch number>.<retransmission sequence number>.<data provision mode (z for initial, i for incremental, f for full)>.dat. The file encoding is UTF8, and the field separator is the ASCII value 0x1B, which lies outside the Chinese character set.
Because a source system may transmit data once a day or in multiple batches per day, the application uses the batch number to indicate which transmission of the day the current one is.
Optionally, the parts of the name are separated by a period ("."), as follows. The source system English abbreviation is 3 lowercase English characters. The table name is the name of the table from which data is extracted, in lowercase. The date, in YYYYMMDD format, is the data date; even if the data is delivered late, the data date of the data is used rather than the current system date. If the interface is a monthly/yearly interface, the date uses the MM/YYYY format. The batch number is 3 digits of the form N+1: with multiple batches it increases sequentially from 1, so the first batch number is 001; if the system only provides a single batch of data, it is 000. Monthly and yearly interfaces only support single-batch transmission. The retransmission sequence number is 2 digits of the form N+1, starting at 00 for the initial transmission and increasing by 1 each time the data is transmitted again; for example, if a data quality problem is found after the initial transmission and the data is retransmitted for the first time, the number is 01. The data provision mode is a lowercase letter: z for initial, i for incremental, f for full.
For example, the interface file name cct.t031001.20151010.000.00.z.dat means: the English abbreviation of the source system is cct, the table name is t031001, the data date is October 10, 2015, the data is a single batch, and it is the initial data in its first transmission.
It should be noted that only one naming scheme for the first naming requirement is listed here. In practice, the fields and their representation can be adjusted as needed, and the application imposes no restriction here.
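The naming scheme described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the regular expression are hypothetical, only daily interfaces (YYYYMMDD dates) are covered, and the table name is assumed to consist of lowercase letters, digits and underscores.

```python
import re
from datetime import date

# Data provision modes listed in the naming requirement.
PROVISION_MODES = {"z": "initial", "i": "incremental", "f": "full"}

# Hypothetical pattern for daily interface files only.
NAME_PATTERN = re.compile(
    r"^(?P<system>[a-z]{3})\.(?P<table>[a-z0-9_]+)\.(?P<data_date>\d{8})"
    r"\.(?P<batch>\d{3})\.(?P<retransmission>\d{2})\.(?P<mode>[zif])\.dat$"
)


def build_interface_file_name(system, table, data_date, batch=0,
                              retransmission=0, mode="i"):
    """Assemble a file name from its parts, zero-padding the counters."""
    if mode not in PROVISION_MODES:
        raise ValueError(f"unknown data provision mode: {mode!r}")
    return (f"{system.lower()}.{table.lower()}.{data_date:%Y%m%d}"
            f".{batch:03d}.{retransmission:02d}.{mode}.dat")


def parse_interface_file_name(name):
    """Split a file name into its parts, rejecting names that do not conform."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a valid interface file name: {name!r}")
    parts = match.groupdict()
    parts["batch"] = int(parts["batch"])
    parts["retransmission"] = int(parts["retransmission"])
    return parts


example_name = build_interface_file_name("cct", "t031001", date(2015, 10, 10), mode="z")
example_parts = parse_interface_file_name(example_name)
```

Here `example_name` reproduces the file name from the example above, and `example_parts` recovers the source system, table, data date, batch number, retransmission number and provision mode from it.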
S103, load the data in the interface data file that meets the first naming requirement into the data warehouse.
The interface data file that meets the first naming requirement is the data file that the target system needs to load, and its data can be loaded into the corresponding target system. Here the data warehouse is the target system relative to the source system, so the data in the interface data file is loaded into the data warehouse.
It should be noted that there may be multiple target systems, whose database types may be the same or different. Different target systems can be served according to different requirements, so that the generated interface data files that meet the first naming requirement can be used by multiple target systems and thus serve different businesses.
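Because the file names are standardized, a loader can pick the files for one source system and one data date with a single pattern. The Python sketch below illustrates this idea; the function names, the sample file list and the itf_<system>_<table> interface table naming are hypothetical, since the application does not spell out the second naming requirement.

```python
import fnmatch


def select_interface_files(file_names, system, data_date):
    """Return the files of one source system and one data date, sorted by name."""
    pattern = f"{system}.*.{data_date}.*.*.*.dat"
    return sorted(fnmatch.filter(file_names, pattern))


def interface_table_name(file_name):
    """Derive a target interface table name from the file name.

    Assumption: the itf_<system>_<table> form is made up for illustration.
    """
    system, table = file_name.split(".")[:2]
    return f"itf_{system}_{table}"


# Hypothetical contents of the directory that receives interface data files.
available_files = [
    "cct.t031002.20151010.000.00.i.dat",
    "cct.t031001.20151010.000.00.z.dat",
    "cct.t031001.20151011.000.00.i.dat",
    "lon.t000001.20151010.000.00.f.dat",
]
selected_files = select_interface_files(available_files, "cct", "20151010")
load_plan = {name: interface_table_name(name) for name in selected_files}
```

The resulting `load_plan` maps each selected file to the interface table it would be loaded into, so one generic loader can handle the whole batch.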
S104, integrate and calculate, in the data warehouse, the data in the interface data file that meets the first naming requirement, to obtain processed data.
The data warehouse is where data is integrated and calculated. For example, integration may be: deposits have corresponding deposit customers, loans have corresponding loan customers, and credit cards have corresponding cardholders, and the data warehouse can integrate this customer information (for example, pool it together). Calculation may be: to obtain all of a person's assets at the bank, the fund usage of his deposits, loans and credit cards under all of his accounts at the bank is summarized.
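The integration and calculation example above can be sketched with an in-memory SQLite database standing in for the data warehouse. The table names, column names and figures are all hypothetical; a real data warehouse would typically be a distributed database with its own SQL dialect.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE itf_deposit (customer_id TEXT, balance REAL);
CREATE TABLE itf_loan    (customer_id TEXT, outstanding REAL);
CREATE TABLE itf_card    (customer_id TEXT, used_credit REAL);
INSERT INTO itf_deposit VALUES ('c001', 5000.0), ('c002', 1200.0);
INSERT INTO itf_loan    VALUES ('c001', 3000.0);
INSERT INTO itf_card    VALUES ('c001', 450.0), ('c002', 100.0);
""")

# Integration: pool the three customer views together.
# Calculation: summarize each customer's figures across all products.
SUMMARY_SQL = """
SELECT customer_id,
       SUM(deposit) AS total_deposit,
       SUM(loan)    AS total_loan,
       SUM(card)    AS total_card
FROM (
    SELECT customer_id, balance AS deposit, 0.0 AS loan, 0.0 AS card FROM itf_deposit
    UNION ALL
    SELECT customer_id, 0.0, outstanding, 0.0 FROM itf_loan
    UNION ALL
    SELECT customer_id, 0.0, 0.0, used_credit FROM itf_card
)
GROUP BY customer_id
ORDER BY customer_id
"""
customer_summary = conn.execute(SUMMARY_SQL).fetchall()
conn.close()
```

Each row of `customer_summary` holds one customer's total deposits, loans and credit card usage, which corresponds to the summarizing operation described above.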
In summary, in the data processing method provided herein, the log file of the source system database, the connection type of the source system database and the preconfigured database extraction parameters are obtained, and target data is extracted from the source system tables. Based on the log file of the source system, a batch data extraction job can therefore extract data in batches from source systems of the same database type through the configured parameter table, and generate interface data files that meet the first naming requirement. The data warehouse can process the data of these interface data files in batches, loading them in batches into the corresponding target system according to their naming scheme. This avoids repetitive work in ETL job data processing, improves the efficiency of ETL job data processing when batch data processing jobs are executed, and reduces the development and maintenance costs of ETL job data processing.
Fig. 2 is a flow diagram of the data processing method provided by another embodiment of the application. As shown in Fig. 2, the method further includes the following steps:
S105, taking the data warehouse as the source system, extract target data from the data warehouse according to the log file of the data warehouse, the connection type of the data warehouse, and the preconfigured database extraction parameters.
The preconfigured database extraction parameters include an identifier of the data warehouse and an identifier of the data warehouse table.
The data of the Data Mart comes from the data warehouse, and the Data Mart lies downstream of the data warehouse in the data flow. Relative to the Data Mart, the data warehouse is therefore the source system, and relative to the data warehouse, the Data Mart is the target system.
S106, generate, from the target data, an interface data file that meets the first naming requirement.
S107, load the data in the interface data file that meets the first naming requirement into the Data Mart.
Optionally, step S101 includes: extracting, by multiple processes concurrently, target data from multiple source system tables according to the log file of the source system database, the connection type of the source system database, and the preconfigured database extraction parameters, wherein one process corresponds to one source system table.
Once the database extraction parameters have been configured, data can be extracted from source systems of the same database type purely through parameter configuration: multiple processes extract target data from the source system tables concurrently, with one process per source system table. This avoids the prior-art need to develop multiple data extraction tasks when extracting from multiple source system tables in the same source system database, and thus avoids developing large numbers of repetitive data extraction tasks, improving development efficiency and reducing maintenance cost.
For example, in the prior art, extracting 300 tables from a source system with an ETL tool requires configuring one extraction task per table, that is, 300 data extraction tasks for 300 tables.
In the application, by contrast, all data extraction tasks are recorded in the database extraction parameter table. In that table, field values distinguish the database type of the source system, the system abbreviation, and so on, and data is extracted from the source system tables based on the table. Different database types can use different data extraction modules; for example, one or two data extraction programs can be developed for each type of source system database. When the source system database is an Oracle database, the data extraction program for that type can be set to A regardless of how many such source systems there are; when it is a MySQL database, the program can be set to B regardless of the number of source systems. When extracting data from source systems of different database types, the corresponding extraction is then performed on the source system tables according to the configured database extraction parameters. By configuring the contents of the database extraction parameter table, the extraction tasks for multiple tables in source system databases of the same type can thus be carried out while avoiding a large amount of repetitive work, which reduces the workload of batch data extraction tasks and improves the efficiency of batch processing.
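A minimal Python sketch of this parameter-driven approach follows, assuming a hypothetical in-memory parameter table and stub extraction programs A and B that only return a label instead of connecting to a database. All names are illustrative and are not taken from the application.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractTask:
    """One row of the hypothetical database extraction parameter table."""
    system: str   # source system abbreviation
    db_type: str  # database type of the source system
    table: str    # source system table name


def extract_oracle(task):
    """Stub for extraction program A (Oracle); a real one would query the database."""
    return f"A:{task.system}.{task.table}"


def extract_mysql(task):
    """Stub for extraction program B (MySQL)."""
    return f"B:{task.system}.{task.table}"


# One extraction program per database type, however many source systems use it.
EXTRACTORS = {"oracle": extract_oracle, "mysql": extract_mysql}


def run_task(task):
    """Dispatch a single table to the extractor for its database type."""
    return EXTRACTORS[task.db_type](task)


def run_all(tasks, executor_cls=ProcessPoolExecutor, workers=4):
    """Extract all tables concurrently; each table is handled by one worker.

    The pool caps the number of simultaneous workers at `workers`.
    """
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(run_task, tasks))


PARAMETER_TABLE = [
    ExtractTask("cct", "oracle", "t031001"),
    ExtractTask("cct", "oracle", "t031002"),
    ExtractTask("lon", "mysql", "loan_contract"),
]
```

Calling `run_all(PARAMETER_TABLE)` from a script's `if __name__ == "__main__":` block runs the extraction in separate processes, while passing `executor_cls=ThreadPoolExecutor` runs the same code with threads, which can be convenient for a quick trial. Adding a table then means adding a row to the parameter table rather than developing a new task.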
Optionally, the pre-configured database extraction parameters further include: data date information of the source system table and an extraction mode identifier, where the extraction mode identifier indicates incremental extraction or full extraction.
In one possible embodiment, the pre-configured database extraction parameters include: the identifier of the source system, the identifier of the source system table, the data date information of the source system table, and the extraction mode identifier, where the extraction mode identifier indicates full extraction or incremental extraction. These parameters can be configured according to the actual extraction job. In addition, entries in the database extraction parameter table can be added or deleted as actually required; the application places no limit on the number or category of the parameters.
The data extraction mode is either incremental extraction or full extraction. Correspondingly, extracting target data from the source system table according to the log file of the source system database, the source system database connection type, and the pre-configured database extraction parameters includes: extracting from the source system table, according to the log file of the source system database, the source system database connection type, and the pre-configured database extraction parameters, the incremental interface data file or the full interface data file corresponding to the data date information of the source system table.
Incremental extraction extracts the data that changed or was newly generated within a preset time period; full extraction extracts the latest state of all data at a preset time point. The preset time may be every hour, every day, every month, and so on; the application does not limit its duration, which is determined by the definition of the source system.
For example, incremental data can be collected daily or monthly, giving a daily increment and a monthly increment. The daily increment can be a snapshot of the latest state of the data that changed or was newly generated between 00:00 and 24:00 of each day. The monthly increment can be a snapshot of the latest state of the data that changed or was newly generated between 00:00 on the 1st of the month and 24:00 on the last day of the month.
Full data can be collected at a fixed time point each day or each month, giving a daily full extract and a monthly full extract. The daily full extract can be a snapshot of the latest state of all data at 24:00 each day. The monthly full extract can be a snapshot of the latest state of all data at 24:00 on the last day of each month.
Different data extraction modes therefore correspond to different interface data files. For example, extracting data from a source system table by incremental extraction generates an incremental interface data file, while extracting data from a source system table by full extraction generates a full interface data file.
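The daily and monthly windows described above can be made concrete with a short sketch. It assumes that the source table carries a change-timestamp column (here the hypothetical `updated_at`), that 24:00 is represented as 00:00 of the following day in a half-open interval, and that table names come from trusted configuration; the `?` placeholder style also varies between database drivers.

```python
import calendar
from datetime import date, datetime, timedelta
from typing import Tuple


def increment_window(data_date: date, period: str) -> Tuple[datetime, datetime]:
    """Return the half-open window [start, end) that contains data_date."""
    if period == "daily":
        # 00:00 of the data date up to 00:00 of the next day (i.e. 24:00).
        start = datetime(data_date.year, data_date.month, data_date.day)
        return start, start + timedelta(days=1)
    if period == "monthly":
        # 00:00 on the 1st up to 24:00 on the last day of the month.
        start = datetime(data_date.year, data_date.month, 1)
        days = calendar.monthrange(data_date.year, data_date.month)[1]
        return start, start + timedelta(days=days)
    raise ValueError(f"unknown period: {period}")


def build_query(table: str, mode: str, data_date: date,
                period: str = "daily",
                changed_col: str = "updated_at") -> Tuple[str, tuple]:
    """Build the extraction query and its parameters for one table."""
    if mode == "full":
        # Full extraction: latest state of all rows at the time point.
        return f"SELECT * FROM {table}", ()
    if mode == "increment":
        # Incremental extraction: only rows changed or created in the window.
        start, end = increment_window(data_date, period)
        sql = (f"SELECT * FROM {table} "
               f"WHERE {changed_col} >= ? AND {changed_col} < ?")
        return sql, (start, end)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(build_query("orders", "increment", date(2019, 3, 22)))
    print(build_query("orders", "full", date(2019, 3, 22)))
```

Using a half-open window means that consecutive daily or monthly increments never overlap and never leave a gap at midnight.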
Optionally, step S103 includes: according to the log file of the data warehouse and the data warehouse connection type, loading the data in the interface data file meeting the first naming requirement into an interface table in the data warehouse that meets a second naming requirement, where the interface table meeting the second naming requirement includes the pre-configured database extraction parameters.
That is, the data in the interface data file meeting the first naming requirement is loaded into an interface table of the data warehouse that meets the second naming requirement. Under the second naming requirement, the interface table loaded into the data warehouse contains the pre-configured database extraction parameters, which include the identifier of the source system and the identifier of the source system table.
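As a rough illustration of this loading step, the sketch below derives an interface table name from the source system identifier and the source system table identifier and loads a delimited interface data file into it. The `itf_<system>_<table>` pattern is purely hypothetical (this passage does not spell out the second naming requirement), and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3
from typing import Iterable


def interface_table_name(system_id: str, table_id: str) -> str:
    # Hypothetical naming pattern that embeds both extraction parameters.
    return f"itf_{system_id}_{table_id}".lower()


def load_interface_file(conn: sqlite3.Connection, system_id: str,
                        table_id: str, lines: Iterable[str]) -> str:
    """Load a CSV interface data file (header row first) into its interface table."""
    rows = list(csv.reader(lines))
    header, data = rows[0], rows[1:]
    name = interface_table_name(system_id, table_id)
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{name}" ({cols})')
    conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', data)
    conn.commit()
    return name


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    interface_file = io.StringIO("id,name\n1,Ann\n2,Bob\n")
    table = load_interface_file(warehouse, "CRM", "customer", interface_file)
    print(table, warehouse.execute(f'SELECT * FROM "{table}"').fetchall())
```

Because the table name is computed from the same parameters that drove extraction, the loader needs no per-table code.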
Optionally, step S104 further includes: calling a job file written in a preset language according to the log file of the data warehouse, the data warehouse connection type, and the database name in the data warehouse.
The job file is run to integrate and compute the data in the interface data file meeting the first naming requirement, obtaining the processed data.
It should be noted that the preset language in the application is SQL. The SQL statements are not tied to any particular type of database and are invoked by a framework program, so they can run without any rewriting on any database that supports SQL. For example, when the data carrier of the data warehouse is replaced (such as migrating from Alibaba's MaxCompute to Hive on Hadoop), only the connection information of the framework program needs to change (for example, the data warehouse connection type, the data warehouse database name, and the log file of the data warehouse).
By calling the job file written in the preset language using the log file of the data warehouse, the data warehouse connection type, the data warehouse database name, and similar information, and by integrating and computing in the data warehouse the data in the interface data file meeting the first naming requirement, the requirements on ETL developers' skills with the database that carries the data warehouse are reduced, and the cross-platform portability of ETL jobs on the data warehouse is improved.
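The separation between SQL job files and the framework program can be sketched as below. Everything here is an assumption for illustration: `WarehouseConfig`, the `CONNECTORS` registry, and `run_job` are hypothetical names, SQLite stands in for the warehouse engine, and statements are split naively on `;`, which would not be safe for SQL containing semicolons inside string literals.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class WarehouseConfig:
    """Connection information held by the framework, not by the job file."""
    connection_type: str  # which connector to use
    database: str         # database name (a path or ":memory:" for SQLite)


def _connect_sqlite(cfg: WarehouseConfig) -> Any:
    return sqlite3.connect(cfg.database)


# Moving to another engine would mean registering another connector here;
# the SQL job text itself stays unchanged.
CONNECTORS: Dict[str, Callable[[WarehouseConfig], Any]] = {
    "sqlite": _connect_sqlite,
}


def run_job(cfg: WarehouseConfig, job_sql: str) -> List[tuple]:
    """Run each statement of a job; return the rows of the last query."""
    conn = CONNECTORS[cfg.connection_type](cfg)
    try:
        cur = conn.cursor()
        rows: List[tuple] = []
        for stmt in job_sql.split(";"):  # naive split, see lead-in
            if stmt.strip():
                cur.execute(stmt)
                if cur.description is not None:  # statement returned rows
                    rows = cur.fetchall()
        conn.commit()
        return rows
    finally:
        conn.close()


JOB_SQL = """
CREATE TABLE itf_crm_orders (customer TEXT, amount INTEGER);
INSERT INTO itf_crm_orders VALUES ('ann', 10), ('ann', 5), ('bob', 7);
SELECT customer, SUM(amount) FROM itf_crm_orders
GROUP BY customer ORDER BY customer
"""

if __name__ == "__main__":
    print(run_job(WarehouseConfig("sqlite", ":memory:"), JOB_SQL))
```

In practice the job text would be read from a job file, and only the `WarehouseConfig` values would differ between warehouse platforms.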
Fig. 3 is a schematic structural diagram of a data processing apparatus provided by the application. As shown in Fig. 3, the apparatus includes: an extraction module 210, a generation module 220, a loading module 230, and an obtaining module 240.
The extraction module 210 is configured to extract target data from the source system table according to the log file of the source system database, the source system database connection type, and the pre-configured database extraction parameters, where the pre-configured database extraction parameters include the identifier of the source system and the identifier of the source system table.
The generation module 220 is configured to generate, according to the target data, an interface data file meeting the first naming requirement.
The loading module 230 is configured to load the data in the interface data file meeting the first naming requirement into the data warehouse.
The obtaining module 240 is configured to integrate and compute, in the data warehouse, the data in the interface data file meeting the first naming requirement, obtaining the processed data.
The above apparatus is used to perform the method provided by the foregoing embodiments; its implementation principle and technical effect are similar and are not repeated here.
Optionally, the extraction module 210 is specifically configured so that, according to the log file of the source system database, the source system database connection type, and the pre-configured database extraction parameters, multiple processes concurrently extract target data from multiple source system tables, with one process corresponding to one source system table.
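A minimal sketch of concurrent, process-based extraction follows, using Python's standard `concurrent.futures.ProcessPoolExecutor`. The `extract_table` body is a placeholder and the job tuples are invented for the example. Note one difference from the text: a pool reuses worker processes, so each table is handled by exactly one process at a time, but a single process may handle several tables in turn.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List, Tuple

Job = Tuple[str, str]  # (source system identifier, source system table identifier)


def extract_table(job: Job) -> str:
    """Placeholder for extracting one source system table in one process."""
    system_id, table_id = job
    # A real implementation would connect to the source database here.
    return f"{system_id}.{table_id}: extracted"


def extract_concurrently(jobs: List[Job], max_workers: int = 4) -> List[str]:
    """Extract the given tables in parallel worker processes, keeping job order."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_table, jobs))


if __name__ == "__main__":
    jobs = [("crm", "customer"), ("crm", "orders"), ("erp", "invoice")]
    for line in extract_concurrently(jobs, max_workers=3):
        print(line)
```

The worker function is defined at module level so that it can be sent to the worker processes.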
Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the application. As shown in Fig. 4, the device includes: a processor 310, a storage medium 320, and a bus 330. The storage medium 320 stores machine-readable instructions executable by the processor 310. When the electronic device runs, the processor 310 and the storage medium 320 communicate over the bus 330, and the processor 310 executes the machine-readable instructions to perform the steps of the above data processing method.
Optionally, the present invention also provides a storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the above data processing method are performed.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected as actually required to achieve the purpose of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Claims (10)

1. A data processing method, comprising:
extracting target data from a source system table according to a log file of a source system database, a source system database connection type, and pre-configured database extraction parameters, wherein the pre-configured database extraction parameters include an identifier of the source system and an identifier of the source system table;
generating, according to the target data, an interface data file meeting a first naming requirement;
loading the data in the interface data file meeting the first naming requirement into a data warehouse; and
integrating and computing, in the data warehouse, the data in the interface data file meeting the first naming requirement to obtain processed data.
2. The method of claim 1, further comprising:
taking the data warehouse as a source system, extracting target data from the data warehouse according to a log file of the data warehouse, a data warehouse connection type, and pre-configured database extraction parameters, wherein the pre-configured database extraction parameters include an identifier of the data warehouse and an identifier of a data warehouse table;
generating, according to the target data, an interface data file meeting the first naming requirement; and
loading the data in the interface data file meeting the first naming requirement into a data mart.
3. The method of claim 1, wherein extracting target data from the source system table according to the log file of the source system database, the source system database connection type, and the pre-configured database extraction parameters comprises:
concurrently extracting, by multiple processes, target data from multiple source system tables according to the log file of the source system database, the connection type of the source system database, and the pre-configured database extraction parameters, wherein one process corresponds to one source system table.
4. The method of claim 1, wherein the pre-configured database extraction parameters further include: data date information of the source system table and an extraction mode identifier, wherein the extraction mode identifier indicates incremental extraction or full extraction; and
extracting target data from the source system table according to the log file of the source system database, the source system database connection type, and the pre-configured database extraction parameters comprises:
extracting from the source system table, according to the log file of the source system database, the source system database connection type, and the pre-configured database extraction parameters, an incremental interface data file or a full interface data file corresponding to the data date information of the source system table.
5. The method of claim 1, wherein loading the data in the interface data file meeting the first naming requirement into the data warehouse comprises:
loading, according to a log file of the data warehouse and a data warehouse connection type, the data in the interface data file meeting the first naming requirement into an interface table in the data warehouse that meets a second naming requirement, wherein the interface table meeting the second naming requirement includes the pre-configured database extraction parameters.
6. The method of claim 1, wherein integrating and computing, in the data warehouse, the data in the interface data file meeting the first naming requirement to obtain processed data comprises:
calling a job file written in a preset language according to a log file of the data warehouse, a data warehouse connection type, and the data warehouse database name; and
running the job file to integrate and compute the data in the interface data file meeting the first naming requirement, obtaining the processed data.
7. A data processing apparatus, comprising: an extraction module, a generation module, a loading module, and an obtaining module, wherein:
the extraction module is configured to extract target data from a source system table according to a log file of a source system database, a source system database connection type, and pre-configured database extraction parameters, wherein the pre-configured database extraction parameters include an identifier of the source system and an identifier of the source system table;
the generation module is configured to generate, according to the target data, an interface data file meeting a first naming requirement;
the loading module is configured to load the data in the interface data file meeting the first naming requirement into a data warehouse; and
the obtaining module is configured to integrate and compute, in the data warehouse, the data in the interface data file meeting the first naming requirement to obtain processed data.
8. The apparatus of claim 7, wherein the extraction module is specifically configured so that, according to the log file of the source system database, the connection type of the source system database, and the pre-configured database extraction parameters, multiple processes concurrently extract target data from multiple source system tables, wherein one process corresponds to one source system table.
9. An electronic device, comprising: a processor, a storage medium, and a bus, wherein the storage medium stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the storage medium communicate over the bus, and the processor executes the machine-readable instructions to perform the steps of the data processing method of any one of claims 1-6.
10. A storage medium on which a computer program is stored, wherein, when the computer program is run by a processor, the steps of the data processing method of any one of claims 1-6 are performed.
CN201910221391.2A 2019-03-22 2019-03-22 Data processing method, device, electronic equipment and storage medium Pending CN109960708A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910221391.2A CN109960708A (en) 2019-03-22 2019-03-22 Data processing method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109960708A true CN109960708A (en) 2019-07-02

Family

ID=67024730

Country Status (1)

Country Link
CN (1) CN109960708A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190702