CN116050235A - Workflow data layout method under cloud side environment and storage medium - Google Patents
Workflow data layout method under cloud side environment and storage medium
- Publication number
- CN116050235A (application CN202310176231.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- copy
- particles
- data center
- task
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/25—Design optimisation, verification or simulation using particle-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a method and a storage medium for workflow data layout in a cloud-edge environment. The cloud-edge environment is represented mathematically and, based on the copy generation cost and the data transmission cost, the data layout problem is modeled as a 0-1 integer programming problem with the aim of minimizing the total time delay, which yields a mathematical problem model. A nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators is adopted to solve the model: the crossover operator and the mutation operator of the genetic algorithm are introduced into the particle swarm algorithm, and the inertia weight is adaptively adjusted according to the difference between the current particle and the global best particle. The workflow data layout is then carried out according to the solution, which effectively reduces the time delay. Introducing the crossover and mutation operators enhances the search capability of the particle swarm algorithm and avoids premature convergence, and adapting the inertia weight to the difference between the current particle and the global best particle makes the optimization process more efficient.
Description
Technical Field
The invention relates to the technical field of workflow data layout, in particular to a method and a storage medium for workflow data layout in a cloud edge environment.
Background
A workflow model is an effective way to describe a business process. It consists of a number of interrelated tasks, and workflows are commonly used in astronomy, physics, bioinformatics and other scientific fields. As data-intensive applications, scientific workflows place stringent demands on the computing power and storage capacity of the environment in which they are deployed.
Cloud computing offers strong storage and computing capabilities, provides personalized services for users, and guarantees the resource supply of scientific workflows. However, running a scientific workflow involves large-scale data transmission, and a remotely deployed cloud can cause serious data transmission delay. Edge computing moves computation to the network edge, close to the user, which reduces the data transmission delay and allows the user's private data to be stored there. Edge computing resources are limited, however, and cannot hold all the data required and generated while a scientific workflow executes. Combining cloud computing and edge computing therefore offers a secure and efficient way to deploy scientific workflows.
Because private data exists, a large amount of data must be transmitted during the execution of a scientific workflow, which causes serious time delay. As storage costs fall, data copies are used frequently in cloud computing and edge computing, and accessing a nearby copy reduces the number of data transmissions. However, laying out data copies in a cloud-edge environment poses many challenges. In particular, generating, transmitting and storing copies all incur overhead, so suitable data must be selected and an appropriate number of copies generated, and the positions at which the copies are laid out are difficult to choose.
How to lay out data copies so as to reduce latency is therefore an important problem.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and a storage medium for workflow data layout in a cloud-edge environment that can effectively reduce time delay.
In order to solve the above technical problem, the invention adopts the following technical scheme:
A method for workflow data layout in a cloud-edge environment comprises the following steps:
S1, mathematically representing the cloud-edge environment and, based on the copy generation cost and the data transmission cost, modeling the data layout problem as a 0-1 integer programming problem with the aim of minimizing the total time delay, to obtain a mathematical problem model;
S2, adopting a nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators, in which the crossover operator and the mutation operator of the genetic algorithm are introduced into the particle swarm algorithm and the inertia weight is adaptively adjusted according to the difference between the current particle and the global best particle, so as to solve the mathematical problem model;
S3, carrying out the workflow data layout according to the solution.
In order to solve the above technical problem, the invention adopts another technical scheme:
A storage medium having stored thereon a computer program which, when executed, performs the steps of the method for workflow data layout in a cloud-edge environment described above.
The beneficial effects of the invention are as follows: in the method and the storage medium for workflow data layout in a cloud-edge environment, the data copy layout is modeled as a 0-1 integer programming problem with the aim of minimizing the total time delay, and a nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators is adopted to solve the data layout problem, which effectively reduces the time delay. Introducing the crossover and mutation operators of the genetic algorithm into the particle swarm algorithm enhances its search capability and avoids premature convergence, and adapting the inertia weight to the difference between the current particle and the global best particle makes the optimization process more efficient.
Drawings
FIG. 1 is a schematic diagram of a scientific workflow example of a method for workflow data layout in a cloud-edge environment according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an example of a workflow data layout in a cloud-edge environment;
FIG. 3 is a schematic diagram of a one-dimensional encoding example of a data layout of a method for workflow data layout in a cloud-edge environment according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a two-dimensional encoding example of a data layout of a method for workflow data layout in a cloud-edge environment according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a mutation operator example of a method for workflow data layout in a cloud-edge environment according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a crossover operator example of a method for workflow data layout in a cloud-edge environment according to an embodiment of the present invention;
FIG. 7 is a flowchart of a method for workflow data layout in a cloud-edge environment according to an embodiment of the present invention.
Detailed Description
In order to describe the technical contents, the achieved objects and effects of the present invention in detail, the following description will be made with reference to the embodiments in conjunction with the accompanying drawings.
Referring to FIG. 1, 2 and 4 to 7, a method for workflow data layout in a cloud-edge environment includes the steps of:
S1, mathematically representing the cloud-edge environment and, based on the copy generation cost and the data transmission cost, modeling the data layout problem as a 0-1 integer programming problem with the aim of minimizing the total time delay, to obtain a mathematical problem model;
S2, adopting a nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators, in which the crossover operator and the mutation operator of the genetic algorithm are introduced into the particle swarm algorithm and the inertia weight is adaptively adjusted according to the difference between the current particle and the global best particle, so as to solve the mathematical problem model;
S3, carrying out the workflow data layout according to the solution.
From the above description, the beneficial effects of the invention are as follows: the data copy layout is modeled as a 0-1 integer programming problem with the aim of minimizing the total time delay, and a nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators is adopted to solve the data layout problem, which effectively reduces the time delay. Introducing the crossover and mutation operators of the genetic algorithm into the particle swarm algorithm enhances its search capability and avoids premature convergence, and adapting the inertia weight to the difference between the current particle and the global best particle makes the optimization process more efficient.
Further, in step S1, the mathematical representation of the cloud-edge environment is specifically:
the cloud-edge environment is expressed as:
$S = \{S_{cld}, S_{edg}\}$;
wherein the cloud $S_{cld}$ comprises $j$ data centers, denoted as:
$S_{cld} = \{s_1, s_2, \ldots, s_j\}$;
the edge $S_{edg}$ comprises $k$ data centers, denoted as:
$S_{edg} = \{s_{j+1}, s_{j+2}, \ldots, s_{j+k}\}$;
each data center $s_i$ is expressed as:
$s_i = \langle c_i, \gamma_i, a_i \rangle$;
wherein $c_i$ represents its storage capacity; $\gamma_i \in \{0, 1\}$ represents the data center type, where $\gamma_i = 0$ indicates a cloud data center, which can store only public data, and $\gamma_i = 1$ indicates an edge data center, which can store public data as well as private data with fixed storage positions; and $a_i$ represents the speed at which the data center replicates data;
the network bandwidth between data centers is expressed as:
wherein $b_{ij}$ represents the bandwidth between data center $s_i$ and data center $s_j$;
the scientific workflow is expressed as:
$G = (V, E, D)$;
wherein $V$ represents the set of tasks in the scientific workflow:
$V = \{v_1, v_2, \ldots, v_w\}$;
$E$ represents the set of task dependencies in the scientific workflow:
$D$ represents the set of data copy sets:
$D = \{d_1, d_2, \ldots, d_m\}$;
the data set related to each task $v_i$ is represented as $\langle D_i, D_o \rangle$, where $D_i$ is its input data set and $D_o$ is its output data set, each consisting of one or more pieces of data. A dependency $e_{ij} \in E$ indicates that task $v_j$ depends on task $v_i$ and can be executed only after task $v_i$ completes; otherwise task $v_j$ has no dependency on task $v_i$. Each data copy set $d_i$ comprises the copies of the $i$-th data: $d_{ij}$ represents the $j$-th copy of the $i$-th data, and $d_{i1}$ represents the $i$-th original data. Each data copy carries the attributes $\langle z_{i1}, n_{i1}, f_{i1}, l_{i1} \rangle$: $z_{i1}$ represents the data size; $n_{i1}$ represents the number of copies of the data, an integer greater than 0, where $n_{i1} = 1$ indicates that the data has no other copies; $f_{i1}$ represents the task that generates data $d_{i1}$, and is marked as 0 if the data is initial data; and $l_{i1}$ records the position of data $d_{ij}$: if $d_{ij}$ is private data, $l_{i1}$ records the data center to which it belongs, and if it is public data, $l_{i1}$ is 0.
From the above description, the cloud edge environment is mathematically represented through the above steps.
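The representation above maps naturally onto a few record types. The following is a minimal sketch, not the patent's implementation: the class and field names (`DataCenter`, `DataCopy`, `Workflow`, `can_store_private`) are hypothetical, and the GB unit for capacity and size is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    """s_i = <c_i, gamma_i, a_i> (field names are hypothetical)."""
    capacity: float    # c_i, storage capacity (GB assumed)
    kind: int          # gamma_i: 0 = cloud, 1 = edge
    copy_speed: float  # a_i, speed at which the data center replicates data

    def can_store_private(self) -> bool:
        # Only edge data centers (gamma_i = 1) may hold private data.
        return self.kind == 1


@dataclass
class DataCopy:
    """Attributes <z, n, f, l> carried by each data copy."""
    size: float        # z, data size
    n_copies: int      # n, number of copies, at least 1
    generated_by: int  # f, generating task; 0 for initial data
    location: int      # l, owning data center for private data; 0 for public data


@dataclass
class Workflow:
    """G = (V, E, D)."""
    tasks: list   # V, task identifiers
    edges: set    # E, pairs (v_i, v_j): v_j runs only after v_i completes
    data: dict    # D, data identifier -> list of DataCopy


# Illustrative values only.
cloud = DataCenter(capacity=float("inf"), kind=0, copy_speed=800.0)
edge = DataCenter(capacity=25.0, kind=1, copy_speed=800.0)
```

Keeping the data center type as the integer $\gamma_i$ mirrors the notation of the text; an enum would serve equally well.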
Further, in step S1, the modeling of the mathematical problem model is specifically:
the copy overhead $t_{copy}$ of data $d_{i1}$ in data center $s_k$ is:
$t_{copy} = \frac{z_{i1}}{a_k}$;
wherein $z_{i1}$ is the size of data $d_{i1}$, and $a_k$ is the speed at which data center $s_k$ copies data;
the transmission overhead $t_{tran}$ of transmitting data $d_{ij}$ from data center $s_k$ to data center $s_l$ is:
$t_{tran} = \frac{z_{i1}}{b_{kl}}$;
wherein $b_{kl}$ is the bandwidth between data center $s_k$ and data center $s_l$; if the copy is generated and laid out on the current data center, there is no transmission overhead;
the data layout is expressed as $\{S, D, Y, T_{total}\}$, wherein $S$ is the data center set, $D$ is the data set, and $Y$ is the set of layout positions of the data; every data $d_{ij} \in D$ corresponds to a unique data center:
$T_{total}$ is the total time delay corresponding to the data layout scheme, the sum of the data copying time $T_{copy}$ and the data transmission time $T_{tran}$:
$T_{total} = T_{copy} + T_{tran}$;
the data replication time $T_{copy}$ is expressed as:
the data transmission time $T_{tran}$ is expressed as:
wherein $h(i, j, k, l) \in \{0, 1\}$; $h(i, j, k, l) = 1$ indicates that the $l$-th copy $d_{kl}$ of data $k$ is transmitted from data center $s_i$ to data center $s_j$, and otherwise $h(i, j, k, l) = 0$;
the objective of the data layout strategy is expressed as:
wherein $\beta(i, j, k) \in \{0, 1\}$; $\beta(i, j, k) = 1$ indicates that the $k$-th copy of data $j$ is stored on data center $i$.
From the above description, it can be seen that, through the above steps, a mathematical problem model of a data layout strategy is obtained that aims to minimize the total delay.
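The overhead terms can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the patent's formulation: it takes each overhead to be size divided by speed (as the definitions of $z$, $a_k$ and $b_{kl}$ suggest), treats bandwidth as symmetric, and represents the transmissions for which $h = 1$ as an explicit event list. The function names and the event tuples are hypothetical.

```python
def copy_overhead(size, copy_speed):
    """t_copy: time to replicate `size` units of data at `copy_speed` units/s."""
    return size / copy_speed


def transfer_overhead(size, src, dst, bandwidth):
    """t_tran: time to move data from data center `src` to `dst`.

    `bandwidth` maps an ordered pair (low, high) to a speed; symmetry of
    the bandwidth between two data centers is an assumption.
    """
    if src == dst:
        return 0.0  # the copy stays on the current data center
    key = (min(src, dst), max(src, dst))
    return size / bandwidth[key]


def total_delay(copy_events, transfer_events, copy_speed, bandwidth):
    """T_total = T_copy + T_tran.

    copy_events:     iterable of (size, data_center)
    transfer_events: iterable of (size, src, dst)
    copy_speed:      dict data_center -> replication speed
    """
    t_copy = sum(copy_overhead(size, copy_speed[dc]) for size, dc in copy_events)
    t_tran = sum(transfer_overhead(size, src, dst, bandwidth)
                 for size, src, dst in transfer_events)
    return t_copy + t_tran


# Illustrative numbers: 10 GB taken as 10 * 1024 MB, speeds in MB/s (both assumed).
bandwidth = {(2, 3): 100.0}
copy_speed = {2: 800.0}
delay = total_delay([(10 * 1024, 2)], [(10 * 1024, 2, 3)], copy_speed, bandwidth)
```

With these illustrative numbers, one copy costs 12.8 s and one transfer 102.4 s, which shows why replacing a repeated transfer with a single copy can pay off when replication is much faster than the link.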
Further, step S2 includes the steps of:
encoding the data layout strategy with a two-dimensional array to construct candidate particles:
each bit $x^{t}_{ij}$ represents the storage locations of the copy set of data $j$ for the $i$-th particle in the $t$-th iteration:
wherein $q_k \in \{0, 1\}$; $q_k = 1$ indicates that a copy of data $j$ is laid out on data center $k$, and $q_k = 0$ indicates that it is not; the number of entries with $q_k = 1$ in $x^{t}_{ij}$ is the number of copies of data $j$.
From the above description, two problems must be considered when data copies are used: (1) how to represent the different copies of a piece of data, and (2) how to represent the storage locations of those copies. The above encoding solves both problems while ensuring completeness and non-redundancy.
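The two-dimensional encoding can be illustrated with a nested list in which row $j$ holds the 0/1 vector $(q_1, \ldots, q_k)$ for data $j$. This is a minimal sketch; the helper names are hypothetical and indices are zero-based for convenience.

```python
def make_particle(n_data, n_centers):
    """An empty layout: one row per data item, one 0/1 entry per data center."""
    return [[0] * n_centers for _ in range(n_data)]


def copy_count(particle, j):
    """Number of copies of data j, i.e. how many q_k equal 1 in row j."""
    return sum(particle[j])


def centers_holding(particle, j):
    """Indices of the data centers on which a copy of data j is laid out."""
    return [k for k, q in enumerate(particle[j]) if q == 1]


particle = make_particle(n_data=3, n_centers=3)
particle[0][1] = 1   # data 0: a single copy on data center 1
particle[1][0] = 1   # data 1: copies on data centers 0 and 2
particle[1][2] = 1
particle[2][2] = 1   # data 2: a single copy on data center 2
```

Because a row records both how many copies exist and where they are, no separate copy-count field is needed, which is one way to read the non-redundancy claim above.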
Further, solving the mathematical problem model according to the nonlinear inertial weight discrete particle swarm optimization algorithm based on the genetic algorithm operator comprises the following steps:
analyzing the scientific workflow and topologically sorting the tasks to obtain a task queue that can be executed in sequence;
initializing the maximum capacity of each data center and generating an initial population according to the private data set, wherein the private data in the initial population is laid out on its corresponding data center and the public data is laid out randomly without generating additional copies;
simulating the data layout process and judging whether each particle is a feasible solution: if so, calculating the total time delay, and if not, recording the infeasible data set;
calculating the fitness of the particles, setting every individual in the initial population as its own individual historical optimum, and setting the particle with the best fitness in the initial population as the population historical optimum;
iterating the population: mutating the population according to the inertia weight factor $w$, crossing the population with the individual historical optimum according to the acceleration factor $\alpha_1$, crossing the population with the population historical optimum according to the acceleration factor $\alpha_2$, calculating the fitness of the new population, and updating the global information;
outputting the total time delay of the population historical optimum when the iteration ends.
From the above description, it can be seen that the above steps realize the data copy layout strategy based on NPSO-GA, that is, the nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators.
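The iteration described above can be sketched as a generic loop with pluggable operators. This is a simplified sketch, not the patent's algorithm: it assumes a scalar fitness where lower is better, keeps $w$, $\alpha_1$ and $\alpha_2$ constant, and applies each crossover when a random number falls below the corresponding acceleration factor (an assumption by analogy with the mutation rule). The function `npso_ga` and the toy operators `flip` and `segment` are hypothetical.

```python
import copy
import random


def npso_ga(population, fitness, mutate, cross, iterations, w, a1, a2, rng):
    """Skeleton of a discrete PSO loop using GA operators (lower fitness wins)."""
    pbest = [copy.deepcopy(p) for p in population]     # individual historical optima
    gbest = copy.deepcopy(min(population, key=fitness))  # population historical optimum
    for _ in range(iterations):
        for i, particle in enumerate(population):
            if rng.random() < w:    # inertia part replaced by mutation
                particle = mutate(particle, rng)
            if rng.random() < a1:   # individual cognition: cross with pbest
                particle = cross(particle, pbest[i], rng)
            if rng.random() < a2:   # social cognition: cross with gbest
                particle = cross(particle, gbest, rng)
            population[i] = particle
            if fitness(particle) < fitness(pbest[i]):
                pbest[i] = copy.deepcopy(particle)
            if fitness(particle) < fitness(gbest):
                gbest = copy.deepcopy(particle)
    return gbest, fitness(gbest)


# Toy problem for illustration only: flat bit lists, fitness = number of ones.
def flip(p, rng):
    q = list(p)
    q[rng.randrange(len(q))] ^= 1
    return q


def segment(p, other, rng):
    q = list(p)
    a = rng.randrange(len(q))
    b = rng.randrange(a, len(q))
    q[a:b + 1] = other[a:b + 1]
    return q


rng = random.Random(0)
pop = [[rng.randint(0, 1) for _ in range(8)] for _ in range(6)]
start = min(sum(p) for p in pop)
best, score = npso_ga(pop, sum, flip, segment, 30, 0.5, 0.5, 0.5, rng)
```

Since the population optimum is replaced only by a strictly better particle, the returned score can never be worse than the best particle of the initial population.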
Further, the calculation of the fitness includes:
establishing the fitness function according to which of the following cases applies when the fitness values $F$ of two particles are compared:
if both compared particles are feasible solutions, the particle with the lower total delay has the better fitness, and the fitness function is defined as:
$F = T_{total}$;
if neither compared particle is a feasible solution, the particle whose infeasible data set $D_{inf}$ is smaller has the better fitness, because more of its data is laid out in feasible locations and it more easily becomes a feasible particle in subsequent iterations; the fitness function is:
$F = |D_{inf}|$;
if a feasible particle is compared with an infeasible particle, the feasible particle is selected, and the fitness function is:
From the above description, it can be seen that, because the encoding of the data layout strategy of the present invention is not robust, infeasible particles can be generated, so different fitness functions need to be defined for the different cases.
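The three comparison cases can be written as a single predicate. This is a minimal sketch; the `Evaluation` record and the function name `is_better` are hypothetical.

```python
from typing import NamedTuple


class Evaluation(NamedTuple):
    feasible: bool      # whether the particle is a feasible solution
    total_delay: float  # T_total, meaningful only when feasible
    n_infeasible: int   # |D_inf|, meaningful only when infeasible


def is_better(a: Evaluation, b: Evaluation) -> bool:
    """Return True when particle a has strictly better fitness than particle b."""
    if a.feasible and b.feasible:
        return a.total_delay < b.total_delay    # case 1: F = T_total
    if not a.feasible and not b.feasible:
        return a.n_infeasible < b.n_infeasible  # case 2: F = |D_inf|
    return a.feasible                           # case 3: the feasible particle wins
```

A pairwise predicate avoids forcing the two incomparable quantities, delay and infeasible-set size, onto one numeric scale.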
Further, the data layout process includes the steps of:
initializing a task position list, which records the execution positions of all tasks, and an overrun mark, which records whether any data center exceeds its capacity limit during task execution;
calculating the capacity usage of each data center after the initial data set has been laid out, then traversing the task queue, calculating the execution position of each task and recording it in the task position list;
when a task generates an output data set, temporarily storing the input data set and the output data set of the task on the data center, judging whether the data center exceeds its capacity limit at that moment, then laying out the output data of the task on the data centers designated by the layout, and updating the capacity usage of the data centers;
if a data center exceeds its capacity limit while the tasks are executed, recording the data laid out on that data center in the infeasible data set $D_{inf}$; otherwise, calculating and recording the total time delay.
As is apparent from the above description, the data layout process is realized through the above steps.
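The capacity check of the simulation can be sketched as follows. This is a simplified illustration under several assumptions, not the patent's procedure: the execution data center of each task is supplied as input rather than computed, only inputs not already stored on the execution data center count as temporary storage, and delay calculation is left out. The function name and argument shapes are hypothetical.

```python
from graphlib import TopologicalSorter


def simulate_layout(capacity, sizes, placement, initial, tasks, deps):
    """Return (task position list, infeasible data set D_inf).

    capacity:  dict data_center -> storage capacity
    sizes:     dict data -> size
    placement: dict data -> list of data centers holding a copy
    initial:   set of data that exists before any task runs
    tasks:     dict task -> (exec_dc, input data list, output data list)
    deps:      dict task -> set of predecessor tasks (every task is a key)
    """
    used = {dc: 0.0 for dc in capacity}
    for d in initial:  # lay out the initial data set
        for dc in placement[d]:
            used[dc] += sizes[d]
    overrun = {dc for dc in used if used[dc] > capacity[dc]}
    positions = []
    for task in TopologicalSorter(deps).static_order():  # executable task queue
        exec_dc, inputs, outputs = tasks[task]
        positions.append((task, exec_dc))
        # Temporary storage while the task runs: fetched inputs plus outputs.
        temp = sum(sizes[d] for d in inputs if exec_dc not in placement[d])
        temp += sum(sizes[d] for d in outputs)
        if used[exec_dc] + temp > capacity[exec_dc]:
            overrun.add(exec_dc)
        for d in outputs:  # lay out the outputs where the particle says
            for dc in placement[d]:
                used[dc] += sizes[d]
                if used[dc] > capacity[dc]:
                    overrun.add(dc)
    infeasible = {d for d, dcs in placement.items() if overrun & set(dcs)}
    return positions, infeasible


sizes = {"d1": 10.0, "d2": 20.0}
placement = {"d1": [1], "d2": [2]}
tasks = {"t1": (1, ["d1"], ["d2"])}
deps = {"t1": set()}
ok = simulate_layout({1: 40.0, 2: 25.0}, sizes, placement, {"d1"}, tasks, deps)
bad = simulate_layout({1: 25.0, 2: 25.0}, sizes, placement, {"d1"}, tasks, deps)
```

In the second call the execution data center overflows while it temporarily holds the output, so the data laid out on it lands in the infeasible set even though the final layout alone would fit.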
Further, in the nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators in step S2, introducing the crossover and mutation operators of the genetic algorithm into the particle swarm algorithm comprises the steps of:
iterating the velocity and position of the particles:
the update strategy of the $i$-th particle at the $t$-th iteration is:
wherein $C_g$ and $C_p$ are crossover operators and $M_u$ is the mutation operator; the update also involves the individual historical optimum of particle $i$ at the $t$-th iteration and $g^t$, the population historical optimum at the $t$-th iteration; $\alpha_1$ and $\alpha_2$ are acceleration factors and $w$ is the inertia weight factor, each taking a value between 0 and 1;
replacing the inertia part of the particle swarm algorithm with the mutation operator of the genetic algorithm:
generating a random number $r_w$ between 0 and 1; if it is smaller than the inertia weight factor $w$, the particle undergoes mutation:
obtaining the infeasible data set $D_{inf}$ of particle $X_i$, and determining the mutation position from the infeasible data set $D_{inf}$ and the private data set $D_{fix}$:
if $D_{inf}$ contains no data, selecting a bit whose data is not in $D_{fix}$; if $D_{inf}$ contains data, selecting a bit that corresponds to public data in $D_{inf}$;
counting the copies of the data at the position to be mutated of particle $X_i$: if
$X_i[muIndex][j] = 1$;
then the data at the position to be mutated of particle $X_i$ has a copy on data center $j$;
updating the copy count by increasing or decreasing it from its original value according to a probability, while ensuring that at least one copy exists and that the copy count does not exceed the number of data centers;
generating, for particle $X_i$, a data copy layout scheme with that copy count at the position to be mutated;
replacing the individual cognition and social cognition parts of the particle swarm algorithm with the crossover operator of the genetic algorithm:
wherein the two crossover terms represent, respectively, the crossover of the particle with its individual historical optimum and the crossover of the particle with the population historical optimum.
From the above description, by introducing the crossover and mutation operators of the genetic algorithm into the particle swarm algorithm, the searching capability of the particle swarm algorithm is enhanced, and premature convergence is avoided.
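The two operators can be sketched on the two-dimensional encoding. This is an illustrative sketch with labeled assumptions: the copy count moves up or down by one with equal probability, the new copy positions are drawn uniformly at random, the fallback when $D_{inf}$ contains only private data is to pick any public row, and the crossover copies a random contiguous range of rows from the better particle (the text does not spell out the crossover mechanics). The names `mutate` and `cross` and their signatures are hypothetical.

```python
import random


def mutate(particle, infeasible, fixed, rng):
    """Mutation operator: re-lay out the copies of one public data item.

    particle:   2-D 0/1 list, rows = data items, columns = data centers
    infeasible: set of row indices in D_inf
    fixed:      set of row indices in D_fix (private data, never mutated)
    """
    n_centers = len(particle[0])
    candidates = [j for j in sorted(infeasible) if j not in fixed]
    if not candidates:  # D_inf empty (or only private): any public row
        candidates = [j for j in range(len(particle)) if j not in fixed]
    if not candidates:
        return [row[:] for row in particle]
    mu_index = rng.choice(candidates)
    count = sum(particle[mu_index]) + rng.choice((-1, 1))  # equal odds assumed
    count = max(1, min(n_centers, count))  # keep between 1 and the number of data centers
    new_row = [0] * n_centers
    for k in rng.sample(range(n_centers), count):
        new_row[k] = 1
    mutated = [row[:] for row in particle]
    mutated[mu_index] = new_row
    return mutated


def cross(particle, best, rng):
    """Crossover operator: copy a random contiguous range of rows from `best`.

    Private rows are assumed identical in every particle, so copying them
    leaves their fixed positions unchanged.
    """
    a = rng.randrange(len(particle))
    b = rng.randrange(a, len(particle))
    child = [row[:] for row in particle]
    for j in range(a, b + 1):
        child[j] = best[j][:]
    return child
```

Both functions return new lists rather than editing in place, so a stored historical optimum is never altered by a later operator call.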
Further, in the nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators in step S2, adaptively adjusting the inertia weight according to the difference between the current particle and the global best particle includes the steps of:
adjusting the inertia weight nonlinearly, based on the degree of difference between the current particle and the global best particle:
adjusting the acceleration factors $\alpha_1$ and $\alpha_2$ using a linear variation strategy:
From the above description, it can be seen that through the above steps, the inertia weight is adaptively adjusted according to the difference between the current particle and the global particle, so that the optimizing process is more efficient.
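The adjustment formulas themselves are not reproduced in this text, so the following is only a sketch of what such a schedule could look like. Everything in it is an assumption: the difference degree is taken as the fraction of differing bits, the exponential form and the constant 1.01 are illustrative choices that make the weight grow with the difference, and the default bounds 0.4 and 0.9 are placeholders. All function names are hypothetical.

```python
import math


def difference_degree(particle, gbest):
    """Fraction of bits in which a particle differs from the global best (assumed measure)."""
    total = sum(len(row) for row in particle)
    differing = sum(a != b
                    for row_p, row_g in zip(particle, gbest)
                    for a, b in zip(row_p, row_g))
    return differing / total


def inertia_weight(d, w_min=0.4, w_max=0.9):
    """Assumed nonlinear form: w_min when d = 0, approaching w_max as d approaches 1."""
    return w_max - (w_max - w_min) * math.exp(d / (d - 1.01))


def linear_factor(start, end, t, t_max):
    """Linear variation of an acceleration factor from `start` to `end` over t_max iterations."""
    return start + (end - start) * t / t_max
```

Under this form, a particle far from the global best mutates more often and explores, while a particle close to it mutates less and refines its layout.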
A storage medium having stored thereon a computer program which when executed performs the steps of a method of workflow data layout in a cloud-edge environment as described above.
The method and the storage medium for workflow data layout in a cloud-edge environment are suitable for laying out workflow data in a cloud-edge environment.
Referring to fig. 1 to 7, a first embodiment of the present invention is as follows:
A method for workflow data layout in a cloud-edge environment comprises the following steps:
S1, mathematically representing the cloud-edge environment and, based on the copy generation cost and the data transmission cost, modeling the data layout problem as a 0-1 integer programming problem with the aim of minimizing the total time delay, to obtain a mathematical problem model;
in step S1, the mathematical representation of the cloud-edge environment is specifically:
the cloud-edge environment is expressed as:
$S = \{S_{cld}, S_{edg}\}$;
wherein the cloud $S_{cld}$ comprises $j$ data centers, denoted as:
$S_{cld} = \{s_1, s_2, \ldots, s_j\}$;
the edge $S_{edg}$ comprises $k$ data centers, denoted as:
$S_{edg} = \{s_{j+1}, s_{j+2}, \ldots, s_{j+k}\}$;
each data center $s_i$ is expressed as:
$s_i = \langle c_i, \gamma_i, a_i \rangle$;
wherein $c_i$ represents its storage capacity; $\gamma_i \in \{0, 1\}$ represents the data center type, where $\gamma_i = 0$ indicates a cloud data center, which can store only public data, and $\gamma_i = 1$ indicates an edge data center, which can store public data as well as private data with fixed storage positions; and $a_i$ represents the speed at which the data center replicates data;
the network bandwidth between data centers is expressed as:
wherein $b_{ij}$ represents the bandwidth between data center $s_i$ and data center $s_j$;
the scientific workflow is expressed as:
$G = (V, E, D)$;
wherein $V$ represents the set of tasks in the scientific workflow:
$V = \{v_1, v_2, \ldots, v_w\}$;
$E$ represents the set of task dependencies in the scientific workflow:
$D$ represents the set of data copy sets:
$D = \{d_1, d_2, \ldots, d_m\}$;
A task is a unit of computation that can run in a data center; tasks take data sets as input and generate new data sets, following a certain order of execution. The data set related to each task $v_i$ is represented as $\langle D_i, D_o \rangle$, where $D_i$ is its input data set and $D_o$ is its output data set, each consisting of one or more pieces of data. A dependency $e_{ij} \in E$ indicates that task $v_j$ depends on task $v_i$ and can be executed only after task $v_i$ completes; otherwise task $v_j$ has no dependency on task $v_i$. Each data copy set $d_i$ comprises the copies of the $i$-th data: $d_{ij}$ represents the $j$-th copy of the $i$-th data, and $d_{i1}$ represents the $i$-th original data. Each data copy carries the attributes $\langle z_{i1}, n_{i1}, f_{i1}, l_{i1} \rangle$: $z_{i1}$ represents the data size; $n_{i1}$ represents the number of copies of the data, an integer greater than 0, where $n_{i1} = 1$ indicates that the data has no other copies; $f_{i1}$ represents the task that generates data $d_{i1}$, and is marked as 0 if the data is initial data; and $l_{i1}$ records the position of data $d_{ij}$: if $d_{ij}$ is private data, $l_{i1}$ records the data center to which it belongs, and if it is public data, $l_{i1}$ is 0.
Different copies can be laid out on different data centers to shorten the data transmission delay; private data cannot be copied. Using data copies creates additional overhead, namely the data copy overhead $t_{copy}$ and the copy transmission overhead $t_{tran}$, and also occupies storage resources of the data centers. The copy overhead $t_{copy}$ of data $d_{i1}$ in data center $s_k$ is:
$t_{copy} = \frac{z_{i1}}{a_k}$;
wherein $z_{i1}$ is the size of data $d_{i1}$, and $a_k$ is the speed at which data center $s_k$ copies data;
the transmission overhead $t_{tran}$ of transmitting data $d_{ij}$ from data center $s_k$ to data center $s_l$ is:
$t_{tran} = \frac{z_{i1}}{b_{kl}}$;
wherein $b_{kl}$ is the bandwidth between data center $s_k$ and data center $s_l$; if the copy is generated and laid out on the current data center, there is no transmission overhead.
In this embodiment, copying all public data would generate a large amount of overhead, so the number of copies of each piece of data in the present invention is dynamic and depends on the number of times the data is used as task input. FIG. 1 illustrates the data replication model of the present invention, which replicates data selectively, trading the overhead of generating a copy for transmission overhead and thereby reducing the overall latency.
The data layout is expressed as $\{S, D, Y, T_{total}\}$, wherein $S$ is the data center set, $D$ is the data set, and $Y$ is the set of layout positions of the data; every data $d_{ij} \in D$ corresponds to a unique data center:
Before a task in a scientific workflow is executed, all input data required by the task must be transmitted to the data center that executes it. Because the data volume in a scientific workflow is huge, the task scheduling time is far smaller than the data transmission time and is therefore ignored. $T_{total}$ is the total time delay corresponding to the data layout scheme, the sum of the data copying time $T_{copy}$ and the data transmission time $T_{tran}$:
$T_{total} = T_{copy} + T_{tran}$;
the data replication time $T_{copy}$ is expressed as:
the data transmission time $T_{tran}$ is expressed as:
wherein $h(i, j, k, l) \in \{0, 1\}$; $h(i, j, k, l) = 1$ indicates that the $l$-th copy $d_{kl}$ of data $k$ is transmitted from data center $s_i$ to data center $s_j$, and otherwise $h(i, j, k, l) = 0$;
the objective of the data layout strategy is expressed as:
wherein $\beta(i, j, k) \in \{0, 1\}$; $\beta(i, j, k) = 1$ indicates that the $k$-th copy of data $j$ is stored on data center $i$.
In this embodiment, FIG. 2 shows data layouts for the scientific workflow of FIG. 1. The scientific workflow contains a task set $V = \{v_1, v_2, v_3, v_4, v_5, v_6, v_7\}$ and a data set $D = \{d_1, d_2, d_3, d_4, d_5, d_6, d_7\}$ with data sizes {6GB,10GB,4GB,3GB, 5GB,11GB}, where the data set is divided into a public data set $D_{flex} = \{d_2, d_6, d_7\}$ and a privacy data set $D_{fix} = \{d_1, d_3, d_4, d_5\}$. The data centers comprise two edge data centers with a capacity of 25 GB and one cloud data center with unlimited storage space. The bandwidths $\{b_{12}, b_{13}, b_{23}\}$ between data centers are set to {10 M/s, 20 M/s, 100 M/s} respectively, and the data replication speed of the data centers is set to 800 M/s. Privacy data $d_1, d_3$ are laid out on edge data center 2, and privacy data $d_4, d_5$ are laid out on edge data center 3. Since all tasks involve privacy data as input or output, tasks $v_1, v_2, v_3, v_6$ execute on edge data center 2 and tasks $v_4, v_5, v_7$ execute on edge data center 3.
FIG. 2a and FIG. 2b are two layout schemes that do not use data copies; they differ in that scheme a lays out data $d_2$ on data center 2, while scheme b lays out data $d_2$ on data center 3. Both schemes transmit data $d_2$ across data centers twice and also transmit data $d_7$ across data centers, causing a delay of about 6144 s. FIG. 2c is the data layout scheme of the present invention using a dynamic number of copies: when $v_2$ generates data $d_2$, the data is copied once and one copy is transmitted to data center 3, so only one copy and one cross-data-center transmission of $d_2$, plus one cross-data-center transmission of $d_7$, are needed, causing a delay of about 5427 s. In addition, copying all public data would cause unnecessary time overhead and could even exceed the capacity limit of the edge data centers. Where capacity permits, the invention trades the cost of generating copies for transmission cost and reduces the total delay through reasonable use of data copies.
S2, adopting a nonlinear inertial weight discrete particle swarm optimization algorithm based on a genetic algorithm operator, introducing a crossover operator and a mutation operator of the genetic algorithm into the particle swarm algorithm, and adaptively adjusting the inertial weight according to the difference between particles and global particles so as to solve the mathematical problem model.
The overall goal of the data layout strategy is to find a mapping from the data set D to the data centers S that minimizes the total delay within the capacity allowed by the data centers. This embodiment provides a data copy layout strategy based on a nonlinear inertia weight discrete particle swarm optimization algorithm with genetic algorithm operators (Nonlinear inertial weight discrete Particle Swarm Optimization algorithm based on Genetic Algorithm's operators, NPSO-GA), which accounts for the cost of generating data copies, selectively copies data according to task requirements, and determines the layout positions of the data.
Problem coding:
Two problems must be considered when using data copies: (1) how to represent different copies of data, and (2) how to represent the storage locations of the data copies. The problem encoding should satisfy completeness, non-redundancy and robustness as far as possible.
FIG. 3 shows a conventional encoding with a static number of copies, in which the same number of copies (2 in the figure) is generated for each public data item and a one-dimensional array represents the data layout scheme of a scientific workflow in a cloud-edge environment, each bit representing the layout position of one data copy. This encoding is complete, since every candidate solution in the problem space can be encoded as a particle, but it is neither non-redundant nor robust. For example, particle $X_1 = (2,2,3,2,3,3,2,3,1,1)$ and particle $X_2 = (2,3,2,2,3,3,2,3,1,1)$ correspond to the same solution in the problem space: data $d_2$ is copied once and its two copies are laid out on data center 2 and data center 3 respectively. In addition, this encoding requires the number of copies to be fixed in advance, so the number of data copies cannot be adjusted according to how frequently the data is used.
In this embodiment, a new coding scheme is proposed, and a two-dimensional array is used to construct candidate solution particles.
The step S3 includes the steps of:
encoding the data layout strategy by adopting a two-dimensional array to construct candidate particles:
Each bit $x^t_{ij}$ represents the storage locations of the copy set of data $j$ for the $i$-th particle in the $t$-th iteration:
where $q_k \in \{0,1\}$; $q_k = 1$ indicates that a copy of data $j$ is laid out on data center $k$, and otherwise no copy of data $j$ is laid out on data center $k$; the number of entries with $q_k = 1$ in $x^t_{ij}$ is the number of copies of data $j$.
Such an encoding is complete and non-redundant, and the number and locations of copies can change as the particle iterates. The data layout of FIG. 2c corresponds to the encoding shown in FIG. 4 (assuming 3 data centers).
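A minimal sketch of the two-dimensional encoding follows. The rows are illustrative only and do not reproduce FIG. 4; the helper names are assumptions made for this example.

```python
# One row per data item; column k-1 is 1 when a copy sits on data center k.
particle = [
    [0, 1, 0],  # a privacy item pinned to data center 2
    [0, 1, 1],  # a public item with copies on data centers 2 and 3
    [0, 0, 1],  # a privacy item pinned to data center 3
]


def copy_count(row):
    """Number of copies of one data item (number of q_k equal to 1)."""
    return sum(row)


def copy_locations(row):
    """1-based indices of the data centers that hold a copy."""
    return [k + 1 for k, q in enumerate(row) if q == 1]


for j, row in enumerate(particle):
    print(j, copy_count(row), copy_locations(row))
```

Because a set of locations has exactly one bit pattern, two different rows can never describe the same placement, and the copy count can change simply by flipping bits.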
Fitness function:
The invention aims to reduce the total delay of the data layout of the scientific workflow; the lower the total delay, the higher the particle quality. However, the encoding of the present invention is not robust and can produce infeasible solution particles. A particle is infeasible for one of two reasons, privacy disclosure or violation of the capacity constraint, so different fitness definitions are needed for different situations. Privacy disclosure means that at least one privacy data item is copied or laid out on a data center other than its own; violation of the capacity constraint means that at least one edge data center stores more data than its capacity allows. The infeasible data set $D_{inf}$ describes the data that cause the particle to be an infeasible solution. Comparing the fitness values F of feasible and infeasible particles falls into three cases.
Based on the comparison of fitness values F for both types of particles, a fitness function is established:
If both compared particles are feasible solutions, the particle with the lower total delay has the better fitness, and the fitness function is defined as:
$$F = T_{total}$$
If both compared particles are infeasible solutions, the particle whose infeasible data set $D_{inf}$ is smaller has the better fitness, since more of its data are laid out in feasible locations and it more easily becomes a feasible particle in subsequent iterations; the fitness function is:
$$F = |D_{inf}|$$
If a feasible particle is compared with an infeasible particle, the feasible one is selected; the fitness function is as follows:
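The three comparison cases can be sketched as a single comparison function. The `Evaluation` container and the `better` helper are hypothetical names introduced for this example, and the formula for the mixed case is not reproduced; only the stated rule (the feasible particle wins) is encoded.

```python
from dataclasses import dataclass, field


@dataclass
class Evaluation:
    """Result of simulating one particle (names are illustrative)."""
    total_delay: float = float("inf")
    infeasible: set = field(default_factory=set)  # plays the role of D_inf

    @property
    def feasible(self) -> bool:
        return not self.infeasible


def better(a: Evaluation, b: Evaluation) -> bool:
    """Return True when particle a is strictly fitter than particle b."""
    if a.feasible and b.feasible:
        return a.total_delay < b.total_delay          # case 1: F = T_total
    if not a.feasible and not b.feasible:
        return len(a.infeasible) < len(b.infeasible)  # case 2: F = |D_inf|
    return a.feasible                                 # case 3: feasible wins


fast = Evaluation(total_delay=5427.0)
slow = Evaluation(total_delay=6144.0)
broken = Evaluation(infeasible={"d2", "d6"})
print(better(fast, slow), better(slow, broken), better(broken, fast))
```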
particle update policy:
PSO (particle swarm optimization) represents each solution in the search space as a particle; the velocity of a particle determines the direction and distance it flies, and the optimal solution is obtained by continuously iterating the velocity and position of the particles:
iterating the velocity and position of the particles:
In this embodiment, the NPSO-GA used is an improvement of the PSO algorithm. The $t$-th update strategy for the $i$-th particle in NPSO-GA is:
where $C_g$ and $C_p$ are crossover operators, $M_u$ is the mutation operator, $p_i^t$ is the individual historical best of particle $i$ at the $t$-th iteration, $g^t$ is the population historical best at the $t$-th iteration, and $\alpha_1$, $\alpha_2$ and $w$ lie between 0 and 1 and represent the acceleration factors and the inertia weight factor;
replacing an inertia part in the particle swarm algorithm by adopting a mutation operator of the genetic algorithm:
generating a random number r between 0 and 1 w If it is smaller than the inertia weight factor w, the particles undergo a mutation process M u As shown in algorithm 1:
in algorithm 1, first an insoluble dataset D of particles X i is acquired inf From the insoluble dataset D inf And a privacy dataset D fix Obtaining a variation position:
if D inf If there is no data, choose not to be at D fix Bit of a data correspondence of D inf If there is data in the list, select D inf The common data of (a) is divided into bits;
counting the copy number of the data corresponding to the position to be mutated of the statistical particles X i, if
X i [muIndex][j]=1;
Then it indicates that the data corresponding to the position to be mutated of particle X i has a copy on data center j;
updating the copy number copy count, increasing or decreasing the copy number copy count according to probability based on the original number, and ensuring that at least one copy exists and the copy number copy count does not exceed the number of the data center;
Is particle X i A data copy layout scheme with copy number of copy count is generated at the position to be mutated.
The overall mutation process not only results in a change in the layout position of the data, but also changes the number of copies, and fig. 5 is an example of a mutation process.
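The mutation steps can be sketched as follows. This is not algorithm 1 itself: the probabilities used to grow or shrink the copy count, the uniform choice of the mutation position, and the representation of $D_{inf}$ and $D_{fix}$ as sets of row indices are all assumptions made for the example.

```python
import random


def mutate(particle, d_inf, d_fix, w, rng):
    """Return a mutated copy of `particle` (rows of 0/1, one row per data item).

    d_inf: indices of data that make the particle infeasible.
    d_fix: indices of privacy data, which are never mutated.
    """
    child = [row[:] for row in particle]
    if rng.random() >= w:              # mutate only when r_w < w
        return child
    candidates = sorted(d_inf) if d_inf else range(len(particle))
    pool = [j for j in candidates if j not in d_fix]
    if not pool:                       # no public data available to mutate
        return child
    mu_index = rng.choice(pool)
    n_centers = len(child[mu_index])
    copy_count = sum(child[mu_index])
    # Assumption: shrink, keep or grow the copy count with equal probability.
    copy_count += rng.choice((-1, 0, 1))
    copy_count = max(1, min(n_centers, copy_count))
    chosen = set(rng.sample(range(n_centers), copy_count))
    child[mu_index] = [1 if k in chosen else 0 for k in range(n_centers)]
    return child


rng = random.Random(0)
parent = [[0, 1, 0], [0, 1, 0], [0, 0, 1]]   # only item 1 is public
child = mutate(parent, d_inf=set(), d_fix={0, 2}, w=1.0, rng=rng)
print(child)
```

Privacy rows are never touched, and the clamp keeps the mutated row between one copy and one copy per data center.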
The individual cognition and social cognition parts in the particle swarm algorithm are replaced by adopting a crossover operator of the genetic algorithm:
where $C_p$ denotes the crossover of the particle with its individual historical best and $C_g$ denotes the crossover of the particle with the population historical best.
The crossover of a particle with the individual historical best (population historical best) proceeds as follows: after the mutation operation, a random number $r_1$ ($r_2$) between 0 and 1 is generated; if it is less than or equal to the acceleration factor $\alpha_1$ ($\alpha_2$), two bits of the particle are randomly selected, the segment between them is taken as the crossover interval, and the segment within the interval is replaced by the corresponding segment of $p$ (or $g$). FIG. 6 is an example of the crossover process.
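The segment crossover can be sketched as below. Whether the two selected positions themselves belong to the crossover interval is not stated, so treating the interval as inclusive is an assumption, as are the function and variable names.

```python
import random


def crossover(particle, guide, alpha, rng):
    """Copy a random contiguous segment of `guide` into a copy of `particle`.

    `guide` is the individual or population historical best; both arguments
    are lists of 0/1 rows, one row per data item.
    """
    child = [row[:] for row in particle]
    if rng.random() > alpha:           # cross only when r <= alpha
        return child
    lo, hi = sorted(rng.sample(range(len(particle)), 2))
    # Assumption: both selected positions are part of the crossover interval.
    child[lo:hi + 1] = [row[:] for row in guide[lo:hi + 1]]
    return child


rng = random.Random(1)
x = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]
p_best = [[0, 1, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0]]
child = crossover(x, p_best, alpha=1.0, rng=rng)
print(child)
```

The same function serves both crossovers: called with the individual best and $\alpha_1$ it plays the role of $C_p$, and with the population best and $\alpha_2$ the role of $C_g$.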
Parameter updating:
A larger inertia weight factor favors global search and escaping local extrema, while a smaller $w$ favors local search so that the algorithm converges quickly to the optimal solution. To balance search speed and search precision, the invention adjusts the inertia weight $w$ nonlinearly:
The strategy of nonlinear adjustment of the inertia weight is adopted, and the inertia weight is adjusted based on the difference degree of the current particle and the global particle:
The difference degree measures the difference between the current particle and the population-optimal particle. When it is larger, the current particle differs more from the population optimum and the inertia weight should be increased to perform a global search; otherwise the inertia weight should be reduced to perform a local search, so that the algorithm converges quickly to the optimal solution.
The acceleration factors $\alpha_1$ and $\alpha_2$ are adjusted using a linear variation strategy:
As the number of iterations increases, $\alpha_1$ decreases continuously and $\alpha_2$ increases continuously. In the early stage of iteration, a larger $\alpha_1$ and a smaller $\alpha_2$ make each particle search for a local optimum within a smaller range, so that the particle search is finer; in the later stage, a smaller $\alpha_1$ and a larger $\alpha_2$ improve the global cooperation among particles and help particles jump out of local optima.
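One way to realise such a linear schedule is plain interpolation between a start value and an end value. The text above does not give the endpoint values, so the defaults below are hypothetical, and the function is only a sketch of a linear variation strategy rather than the exact formula of the invention.

```python
def acceleration_factors(t, max_iter,
                         a1_start=0.9, a1_end=0.2,
                         a2_start=0.4, a2_end=0.9):
    """Linearly decrease alpha_1 and increase alpha_2 over the iterations.

    All four endpoint values are hypothetical defaults.
    """
    frac = t / max_iter
    alpha_1 = a1_start - (a1_start - a1_end) * frac
    alpha_2 = a2_start + (a2_end - a2_start) * frac
    return alpha_1, alpha_2


for t in (0, 50, 100):
    a1, a2 = acceleration_factors(t, 100)
    print(t, round(a1, 3), round(a2, 3))
```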
Data copy layout strategy overview:
In algorithm 2, the system is first initialized (lines 1-5): the scientific workflow is parsed and the tasks are topologically sorted to obtain a task queue that can be executed sequentially (line 1);
The maximum capacity of the data centers is initialized (line 2), and an initial population is generated according to the privacy data set, in which privacy data are laid out on their corresponding data centers and public data are laid out randomly without generating additional copies (line 3);
In this embodiment, the data layout process is simulated by the DataPlacement() function, which determines whether each particle is a feasible solution; if so, the total delay is calculated, and if not, the infeasible data set is recorded (line 4);
The fitness of the particles is calculated, every individual in the initial population is set as its own individual historical best, and the population historical best is set to the particle with the best fitness in the initial population (line 5);
The population is then iterated (lines 6-12): the population is mutated according to the inertia weight factor $w$ (line 8), crossed with the individual historical best according to the acceleration factor $\alpha_1$ and with the population historical best according to the acceleration factor $\alpha_2$ (line 9), the fitness of the new population is calculated, and the global information is updated (lines 10-11);
At the end of the iteration, the total delay of the population historical best is output (line 12).
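The control flow of the overview can be sketched as a generic loop with pluggable operators. This is a skeleton under stated assumptions, not algorithm 2: the operators are passed in as callables, the schedule returns one $(w, \alpha_1, \alpha_2)$ triple per iteration instead of a per-particle inertia weight, and the toy operators at the bottom exist only so that the skeleton runs on bit lists.

```python
import random


def npso_ga(population, evaluate, better, mutate, crossover, schedule,
            max_iter, rng):
    """Return (best_particle, best_score) after max_iter generations."""
    population = [p[:] for p in population]
    scores = [evaluate(p) for p in population]
    p_best = list(zip(population, scores))       # individual historical best
    g_best = p_best[0]                           # population historical best
    for candidate in p_best[1:]:
        if better(candidate[1], g_best[1]):
            g_best = candidate
    for t in range(max_iter):
        w, a1, a2 = schedule(t, max_iter)
        for i, particle in enumerate(population):
            x = mutate(particle, w, rng)              # replaces inertia part
            x = crossover(x, p_best[i][0], a1, rng)   # individual cognition
            x = crossover(x, g_best[0], a2, rng)      # social cognition
            score = evaluate(x)
            population[i] = x
            if better(score, p_best[i][1]):
                p_best[i] = (x, score)
            if better(score, g_best[1]):
                g_best = (x, score)
    return g_best


# Toy stand-ins: particles are flat bit lists and fewer 1 bits is better.
def toy_mutate(x, w, rng):
    y = x[:]
    if rng.random() < w:
        k = rng.randrange(len(y))
        y[k] = 1 - y[k]
    return y


def toy_crossover(x, guide, alpha, rng):
    y = x[:]
    if rng.random() <= alpha:
        lo, hi = sorted(rng.sample(range(len(y)), 2))
        y[lo:hi + 1] = guide[lo:hi + 1]
    return y


def toy_schedule(t, max_iter):
    return 0.5, 0.5, 0.5   # fixed w, alpha_1, alpha_2 for the toy run


rng = random.Random(42)
start = [[rng.randint(0, 1) for _ in range(8)] for _ in range(6)]
best, best_score = npso_ga(start, sum, lambda a, b: a < b,
                           toy_mutate, toy_crossover, toy_schedule, 30, rng)
print(best, best_score)
```

In a real run, `evaluate` would be the data layout simulation and `better` the three-case fitness comparison; the loop itself only guarantees that the returned score is never worse than the best initial particle.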
Data layout process:
In this embodiment, the data layout process function DataPlacement() returns the fitness information of the population: for feasible particles it records the total delay, and for infeasible particles it records the infeasible data set $D_{inf}$.
The data layout process comprises the steps of:
A task position list (taskLocList), which records the execution positions of all tasks, and an overrun flag (flagOverflow), which records whether any data center exceeds its capacity limit during task execution, are initialized (lines 1-4);
The capacity usage of the data centers after the initial data set is laid out is calculated (lines 5-7); the capacity usage during task execution is then calculated by traversing the task queue (lines 8-17), computing the execution position of each task and recording it in the task position list (lines 9-13);
When a task generates an output data set, the input data set and the output data set of the task are temporarily stored on the data center, and it is checked whether the data center exceeds its capacity limit at that moment (lines 14-15); the output data of the task are then laid out on the data centers designated for them, and the capacity of the data centers is updated (line 16);
If a data center exceeds its capacity limit during task execution, the data laid out on that data center are recorded in the infeasible data set $D_{inf}$ (lines 18-19); otherwise, the total delay, comprising the data replication delay and the data transmission delay, is calculated and recorded (lines 20-22).
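The capacity part of this check can be illustrated with a simplified, static sketch. It only sums the sizes of the data laid out on each data center and deliberately ignores the temporary storage of task inputs and outputs during execution, so it is a reduced stand-in for the simulation described above. The function name, the use of `None` for unlimited capacity, and the sample sizes are assumptions.

```python
def infeasible_data(layout, sizes_gb, capacities_gb):
    """Return indices of data laid out on data centers that exceed capacity.

    layout: rows of 0/1, one row per data item, one column per data center.
    capacities_gb: capacity per data center, None meaning unlimited.
    """
    used = [0.0] * len(capacities_gb)
    for j, row in enumerate(layout):
        for k, q in enumerate(row):
            if q:
                used[k] += sizes_gb[j]
    d_inf = set()
    for k, cap in enumerate(capacities_gb):
        if cap is not None and used[k] > cap:
            d_inf.update(j for j, row in enumerate(layout) if row[k])
    return d_inf


capacities = [None, 25, 25]   # one cloud center, two 25 GB edge centers
sizes = [6, 10, 11]           # hypothetical sizes in GB
ok_layout = [[0, 1, 0], [0, 1, 0], [0, 0, 1]]    # 16 GB and 11 GB on the edges
bad_layout = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]   # 27 GB on the first edge center
print(infeasible_data(ok_layout, sizes, capacities))
print(infeasible_data(bad_layout, sizes, capacities))
```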
And S3, carrying out workflow data layout according to the solving result.
The second embodiment of the invention is as follows:
a storage medium having stored thereon a computer program for workflow data layout in a cloud-edge environment, characterized in that the computer program when executed performs the steps of a method for workflow data layout in a cloud-edge environment according to any of the preceding claims 1-9.
In summary, the workflow data layout method and storage medium for a cloud-edge environment provided by the invention adaptively generate data copies to optimize the transmission delay of a running scientific workflow, taking into account factors such as transmission bandwidth, data copy generation cost, data center capacity and privacy data. With the aim of minimizing the total delay, the data copy layout is modeled as a 0-1 integer programming problem; copies are generated for frequently used data according to the topology of the scientific workflow, trading the cost of generating copies for transmission cost and thereby reducing the total delay. A nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators is proposed to solve the data layout problem. The crossover and mutation operators of the genetic algorithm are introduced into the particle swarm algorithm, which strengthens its search capability and avoids premature convergence, and the inertia weight is adaptively adjusted according to the difference between the current particle and the global particle, which makes the optimization process more efficient.
The core purpose of the invention is to minimize the delay during scientific workflow execution while satisfying data privacy and the storage capacity limits of the data centers.
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent changes made by the specification and drawings of the present invention, or direct or indirect application in the relevant art, are included in the scope of the present invention.
Claims (10)
1. A workflow data layout method in a cloud-edge environment, characterized by comprising the following steps:
S1, mathematically representing the cloud-edge environment and, with the aim of minimizing the total delay, modeling the data layout problem as a 0-1 integer programming problem based on copy generation cost and data transmission cost, to obtain a mathematical problem model;
S2, adopting a nonlinear inertia weight discrete particle swarm optimization algorithm based on genetic algorithm operators, introducing the crossover operator and the mutation operator of the genetic algorithm into the particle swarm algorithm, and adaptively adjusting the inertia weight according to the difference between particles and global particles, so as to solve the mathematical problem model;
S3, performing workflow data layout according to the solving result.
2. The method for workflow data layout in a cloud-edge environment according to claim 1, wherein the mathematical representation of the cloud-edge environment in step S1 is specifically:
the cloud-edge environment is expressed as:
$$S = \{S_{cld}, S_{edg}\}$$
wherein the cloud $S_{cld}$ comprises $j$ data centers, denoted as:
$$S_{cld} = \{s_1, s_2, \ldots, s_j\}$$
the edge $S_{edg}$ comprises $k$ data centers, denoted as:
$$S_{edg} = \{s_{j+1}, s_{j+2}, \ldots, s_{j+k}\}$$
each data center $s_i$ is expressed as:
$$s_i = \langle c_i, \gamma_i, a_i \rangle$$
wherein $c_i$ represents its storage capacity; $\gamma_i \in \{0,1\}$ represents the data center type, $\gamma_i = 0$ indicating a cloud data center, which can store only public data, and $\gamma_i = 1$ indicating an edge data center, which can store public data and privacy data with fixed storage locations; and $a_i$ represents the speed at which the data center copies data;
the network bandwidth between data centers is expressed as:
wherein $b_{ij}$ represents the bandwidth between data center $s_i$ and data center $s_j$;
the scientific workflow is expressed as:
$$G = (V, E, D)$$
wherein $V$ represents the set of tasks in the scientific workflow:
$$V = \{v_1, v_2, \ldots, v_w\}$$
$E$ represents the set of task dependencies in the scientific workflow:
$D$ represents the set of data copies:
$$D = \{d_1, d_2, \ldots, d_m\}$$
the data sets related to each task $v_i$ are represented as $\langle D_i, D_o \rangle$, wherein $D_i$ represents its input data set and $D_o$ represents its output data set, each consisting of one or more data; an inter-task dependency $e_{ij} \in E$ indicates that task $v_j$ is a successor of task $v_i$ and can be executed only after task $v_i$ completes, and otherwise task $v_j$ has no dependency on task $v_i$; each data copy set $d_i$ comprises several copies of the $i$-th data, $d_{ij}$ denoting the $j$-th copy of the $i$-th data and $d_{i1}$ denoting the original $i$-th data; each data copy carries the attributes $\langle z_{i1}, n_{i1}, f_{i1}, l_{i1} \rangle$, wherein $z_{i1}$ represents the data size; $n_{i1}$ represents the number of copies of the data, an integer greater than 0, $n_{i1} = 1$ indicating that the data has no other copies; $f_{i1}$ represents the task that generates data $d_{i1}$ and is recorded as 0 if the data is initial data; and $l_{i1}$ records the storage location of data $d_{ij}$, being the data center to which the data belongs if $d_{ij}$ is privacy data and 0 if it is public data.
3. The method for workflow data layout in a cloud-edge environment according to claim 2, wherein the modeling of the mathematical problem model in step S3 is specifically:
the copy overhead $t_{copy}$ of data $d_{i1}$ in data center $s_k$ is:
$$t_{copy} = \frac{z_{i1}}{a_k}$$
wherein $z_{i1}$ is the size of data $d_{ij}$ and $a_k$ is the speed at which data center $s_k$ copies data;
the transmission overhead $t_{tran}$ of transmitting data $d_{ij}$ from data center $s_k$ to data center $s_l$ is:
$$t_{tran} = \frac{z_{i1}}{b_{kl}}$$
wherein $b_{kl}$ is the bandwidth between data center $s_k$ and data center $s_l$, and if the copy is laid out on the data center where it was copied, there is no transmission overhead;
the data layout is expressed as $\{S, D, Y, T_{total}\}$, wherein $S$ is the set of data centers, $D$ is the data set, and $Y$ is the set of layout positions of the data, every data copy $d_{ij} \in D$ corresponding to a unique data center:
$T_{total}$ is the total delay corresponding to the data layout scheme, the sum of the data replication time $T_{copy}$ and the data transmission time $T_{tran}$:
$$T_{total} = T_{copy} + T_{tran}$$
the data replication time $T_{copy}$ is expressed as:
the data transmission time $T_{tran}$ is expressed as:
wherein $h(i,j,k,l) \in \{0,1\}$, $h(i,j,k,l) = 1$ indicating that the $l$-th copy $d_{kl}$ of data $k$ is transmitted from data center $s_i$ to data center $s_j$, and otherwise $h(i,j,k,l) = 0$;
the objective of the data layout strategy is expressed as:
wherein $\beta(i,j,k) \in \{0,1\}$, $\beta(i,j,k) = 1$ indicating that the $k$-th copy of data $j$ is stored on data center $i$.
4. The method for workflow data layout in cloud-edge environment as recited in claim 1, wherein said step S3 comprises the steps of:
encoding the data layout strategy by adopting a two-dimensional array to construct candidate particles:
each bit $x^t_{ij}$ represents the storage locations of the copy set of data $j$ for the $i$-th particle in the $t$-th iteration:
wherein $q_k \in \{0,1\}$, $q_k = 1$ indicating that a copy of data $j$ is laid out on data center $k$ and otherwise that no copy of data $j$ is laid out on data center $k$, and the number of entries with $q_k = 1$ in $x^t_{ij}$ is the number of copies of data $j$.
5. The method of workflow data placement in a cloud-edge environment of claim 4, wherein solving the mathematical problem model according to the genetic algorithm operator based nonlinear inertial weight discrete particle swarm optimization algorithm comprises the steps of:
analyzing the scientific workflow, and performing topological ordering on the tasks to obtain a task queue capable of being sequentially executed;
initializing the maximum capacity of a data center, generating an initialization population according to a privacy data set, wherein privacy data in the initialization population can be laid out on the corresponding data center, and public data is randomly laid out without generating other copies;
simulating the data layout process and judging whether each particle is a feasible solution, calculating the total delay if so, and recording the infeasible data set if not;
setting all individuals in the initial population as the optimal individual history, setting the optimal population history as the particles with the best fitness in the initial population, and calculating the fitness of the particles;
iterating the population: mutating the population according to the inertia weight factor $w$, crossing the population with the individual historical best according to the acceleration factor $\alpha_1$, crossing the population with the population historical best according to the acceleration factor $\alpha_2$, calculating the fitness of the new population, and updating the global information;
and outputting the total time delay with optimal population history when the iteration is finished.
6. The method of workflow data placement in a cloud-edge environment of claim 5, wherein the fitness calculation comprises:
based on the comparison of fitness values F for both types of particles, a fitness function is established:
if both compared particles are feasible solutions, the particle with the lower total delay has the better fitness, and the fitness function is defined as:
$$F = T_{total}$$
if both compared particles are infeasible solutions, the particle whose infeasible data set $D_{inf}$ is smaller has the better fitness, since more of its data are laid out in feasible locations and it more easily becomes a feasible particle in subsequent iterations, and the fitness function is:
$$F = |D_{inf}|$$
if a feasible particle is compared with an infeasible particle, the feasible one is selected, and the fitness function is as follows:
7. The method for workflow data layout in a cloud-edge environment of claim 5, wherein the data layout process comprises the steps of:
Initializing a task position list for recording the execution positions of all tasks and an overrun mark for recording whether a data center exceeds the capacity limit of the data center in the task execution process;
calculating the capacity condition of the data center after the initial data set is subjected to data layout, traversing the task queue, calculating the execution position of the task and recording the execution position of the task into a task position list;
when a task generates an output data set, temporarily storing the input data set and the output data set of the task on a data center, judging whether the data center exceeds capacity limit at the moment, then distributing the output data of the task on the data center designated by the task, and updating the capacity of the data center;
if a data center exceeds its capacity limit during task execution, recording the data laid out on that data center in the infeasible data set $D_{inf}$, and otherwise calculating and recording the total delay.
8. The method for workflow data layout in a cloud environment as claimed in claim 7, wherein in the nonlinear inertial weight discrete particle swarm optimization algorithm based on genetic algorithm operator in step S3, introducing crossover and mutation operators of genetic algorithm in the particle swarm algorithm comprises the steps of:
Iterating the velocity and position of the particles:
the $t$-th update strategy for the $i$-th particle is:
wherein $C_g$ and $C_p$ are crossover operators, $M_u$ is the mutation operator, $p_i^t$ is the individual historical best of particle $i$ at the $t$-th iteration, $g^t$ is the population historical best at the $t$-th iteration, and $\alpha_1$, $\alpha_2$ and $w$ lie between 0 and 1 and represent the acceleration factors and the inertia weight factor;
replacing an inertia part in the particle swarm algorithm by adopting a mutation operator of the genetic algorithm:
generating a random number r between 0 and 1 w If it is smaller than the inertial weight factor w, the particles undergo mutation:
acquisition of an insoluble data set D of particles X i inf From the insoluble dataset D inf And a privacy dataset D fix Obtaining a variation position:
if D inf If there is no data, choose not to be at D fix Bit of a data correspondence of D inf If there is data in the list, select D inf The common data of (a) is divided into bits;
counting the copy number of the data corresponding to the position to be mutated of the particle Xi, if
X i [muIndex][j]=1;
The corresponding data of the position to be mutated of the particle Xi is shown to have a copy on the data center j;
updating the copy number copy count, increasing or decreasing the copy number copy count according to probability based on the original number, and ensuring that at least one copy exists and the copy number copy count does not exceed the number of the data center;
Is particle X i Generating a data copy layout scheme with copy number of copy count at the position to be mutated;
the individual cognition and social cognition parts in the particle swarm algorithm are replaced by adopting a crossover operator of the genetic algorithm:
9. The method for workflow data layout in a cloud environment according to claim 5, wherein in the nonlinear inertial weight discrete particle swarm optimization algorithm based on genetic algorithm operator in step S3, the self-adaptive adjustment of the inertial weight according to the difference between the particles and the global particles comprises the steps of:
a strategy of non-linearly adjusting the inertial weight is adopted, adjusting inertial weights based on the degree of difference of the current particle and the global particle:
adjusting the acceleration factors $\alpha_1$ and $\alpha_2$ using a linear variation strategy:
10. A storage medium having stored thereon a computer program for workflow data layout in a cloud-edge environment, characterized in that the computer program when executed performs the steps of a method for workflow data layout in a cloud-edge environment according to any of the preceding claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310176231.7A CN116050235A (en) | 2023-02-28 | 2023-02-28 | Workflow data layout method under cloud side environment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310176231.7A CN116050235A (en) | 2023-02-28 | 2023-02-28 | Workflow data layout method under cloud side environment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116050235A true CN116050235A (en) | 2023-05-02 |
Family
ID=86127427
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310176231.7A Pending CN116050235A (en) | 2023-02-28 | 2023-02-28 | Workflow data layout method under cloud side environment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116050235A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116955354A (en) * | 2023-06-30 | 2023-10-27 | 国家电网有限公司大数据中心 | Identification analysis method and device for energy digital networking |
Similar Documents
Publication | Title |
---|---|
Guo et al. | Cloud resource scheduling with deep reinforcement learning and imitation learning |
Das et al. | Recent advances in differential evolution–an updated survey |
Xu et al. | Chemical reaction optimization for task scheduling in grid computing |
CN113098714B (en) | Low-delay network slicing method based on reinforcement learning |
US7882052B2 (en) | Evolutionary neural network and method of generating an evolutionary neural network |
CN108875955A (en) | Gradient based on parameter server promotes the implementation method and relevant device of decision tree |
CN109522104B (en) | Method for optimizing scheduling of two target tasks of Iaas by using differential evolution algorithm |
Tawhid et al. | A hybrid social spider optimization and genetic algorithm for minimizing molecular potential energy function |
CN115168281B (en) | Neural network on-chip mapping method and device based on tabu search algorithm |
CN116050235A (en) | Workflow data layout method under cloud side environment and storage medium |
Deb et al. | Classifying metamodeling methods for evolutionary multi-objective optimization: first results |
CN115293623A (en) | Training method and device for production scheduling model, electronic equipment and medium |
CN111414961A (en) | Task parallel-based fine-grained distributed deep forest training method |
AlSuwaidan et al. | Swarm Intelligence Algorithms for Optimal Scheduling for Cloud‐Based Fuzzy Systems |
Ming et al. | Intelligent approaches to tolerance allocation and manufacturing operations selection in process planning |
TWI758223B (en) | Computing method with dynamic minibatch sizes and computing system and computer-readable storage media for performing the same |
Younis et al. | Genetic algorithm for independent job scheduling in grid computing |
CN110175172B (en) | Extremely-large binary cluster parallel enumeration method based on sparse bipartite graph |
CN108289115A (en) | A kind of information processing method and system |
Ho et al. | Adaptive communication for distributed deep learning on commodity GPU cluster |
Spivak et al. | Storage tier-aware replicative data reorganization with prioritization for efficient workload processing |
Dou et al. | A genetic algorithm with path-relinking for operation sequencing in CAPP |
CN101378406A (en) | Method for selecting data grid copy |
CN115242796B (en) | Task scheduling method for cloud-edge-end scene |
Lvovich et al. | Optimization of internet of things system |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |