wfcommons.generator
wfcommons.generator.generator
-
class
wfcommons.generator.generator.
WorkflowGenerator
(workflow_recipe: wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe, logger: Optional[logging.Logger] = None) Bases:
object
A generator of synthetic workflow traces based on workflow recipes obtained from the analysis of real workflow execution traces.
Parameters: - workflow_recipe (WorkflowRecipe) – The workflow recipe to be used for this generator.
- logger (Logger) – The logger used to record information, warnings, or errors (optional).
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace based on the workflow recipe used to instantiate the generator.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
build_workflows
(num_workflows: int) → List[wfcommons.common.workflow.Workflow] Generate a number of synthetic workflow traces based on the workflow recipe used to instantiate the generator.
Parameters: num_workflows (int) – The number of workflows to be generated. Returns: A list of synthetic workflow trace objects. Return type: List[Workflow]
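Example (illustrative sketch). The relationship between the generator and its recipe can be pictured as simple delegation. The classes below are hypothetical stand-ins written for this example, not the wfcommons classes; in particular, it is an assumption that build_workflows amounts to one build_workflow call per requested workflow.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class SketchRecipe(ABC):
    """Hypothetical stand-in for WorkflowRecipe (not the wfcommons class)."""

    @abstractmethod
    def build_workflow(self, workflow_name: Optional[str] = None) -> dict:
        ...


class TwoTaskRecipe(SketchRecipe):
    """Toy recipe that always yields a workflow with two tasks."""

    def build_workflow(self, workflow_name: Optional[str] = None) -> dict:
        return {"name": workflow_name or "toy-workflow", "tasks": ["a", "b"]}


class SketchGenerator:
    """Hypothetical stand-in for WorkflowGenerator."""

    def __init__(self, workflow_recipe: SketchRecipe) -> None:
        self.workflow_recipe = workflow_recipe

    def build_workflow(self, workflow_name: Optional[str] = None) -> dict:
        # Delegate to the recipe used to instantiate the generator.
        return self.workflow_recipe.build_workflow(workflow_name)

    def build_workflows(self, num_workflows: int) -> List[dict]:
        # Assumption: one independent build_workflow() call per workflow.
        return [self.build_workflow() for _ in range(num_workflows)]


generator = SketchGenerator(TwoTaskRecipe())
workflows = generator.build_workflows(3)
```

Keeping the generator this thin means that all workflow-specific logic stays inside the recipe.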
wfcommons.generator.workflow.abstract_recipe
-
class
wfcommons.generator.workflow.abstract_recipe.
WorkflowRecipe
(name: str, data_footprint: Optional[int], num_tasks: Optional[int], runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0, logger: Optional[logging.Logger] = None) Bases:
abc.ABC
An abstract class of workflow recipes for creating synthetic workflow traces.
Parameters: - name (str) – The workflow recipe name.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
- logger (Logger) – The logger used to record information, warnings, or errors (optional).
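Example (illustrative sketch). One way to read the three factors is as plain multipliers applied to each task. That reading is an assumption made for this example; SketchTask and scale_task are hypothetical names and are not part of wfcommons.

```python
from dataclasses import dataclass


@dataclass
class SketchTask:
    """Hypothetical task record used only for this example."""
    runtime: float    # seconds
    input_size: int   # bytes
    output_size: int  # bytes


def scale_task(task: SketchTask,
               runtime_factor: float = 1.0,
               input_file_size_factor: float = 1.0,
               output_file_size_factor: float = 1.0) -> SketchTask:
    # Assumption: each factor is a simple multiplier; 1.0 leaves values unchanged.
    return SketchTask(
        runtime=task.runtime * runtime_factor,
        input_size=int(task.input_size * input_file_size_factor),
        output_size=int(task.output_size * output_file_size_factor),
    )


base = SketchTask(runtime=10.0, input_size=1000, output_size=400)
scaled = scale_task(base, runtime_factor=0.5, input_file_size_factor=2.0)
```

Under this reading, a factor below 1.0 shrinks the value and a factor above 1.0 enlarges it.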
-
_abc_impl
= <_abc_data object>
-
_generate_file
(extension: str, recipe: Dict[str, Any], link: wfcommons.common.file.FileLink) → wfcommons.common.file.File Generate a file according to a file recipe.
Parameters: - extension (str) –
- recipe (Dict[str, Any]) – Recipe for generating the file.
- link (FileLink) – Type of file link.
Returns: The generated file.
Return type: File
-
_generate_files
(task_id: str, recipe: Dict[str, Any], link: wfcommons.common.file.FileLink, files_recipe: Optional[Dict[wfcommons.common.file.FileLink, Dict[str, int]]] = None) → None Generate files for a specific task ID.
Parameters:
-
_generate_task
(task_name: str, task_id: str, input_files: Optional[List[wfcommons.common.file.File]] = None, files_recipe: Optional[Dict[wfcommons.common.file.FileLink, Dict[str, int]]] = None) → wfcommons.common.task.Task Generate a synthetic task.
Parameters:
Returns: A task object.
Return type: Task
-
_generate_task_name
(prefix: str) → str Generate a task name from a prefix appended with an ID.
Parameters: prefix (str) – The task prefix. Returns: The task name, formed from the prefix with an ID appended. Return type: str
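Example (illustrative sketch). Appending an ID to a prefix could look like the following. The separator, the zero-padding width, and the use of a running counter are all assumptions made for this example; TaskNamer is a hypothetical helper, not a wfcommons class.

```python
import itertools


class TaskNamer:
    """Hypothetical helper that appends a running ID to a task prefix."""

    def __init__(self) -> None:
        # Counter starts at 1 and increases by one per generated name.
        self._counter = itertools.count(1)

    def generate_task_name(self, prefix: str) -> str:
        # Assumed format: "<prefix>_<8-digit zero-padded ID>".
        return f"{prefix}_{next(self._counter):08d}"


namer = TaskNamer()
first = namer.generate_task_name("mProject")   # "mProject_00000001"
second = namer.generate_task_name("mProject")  # "mProject_00000002"
```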
-
_get_files_by_task_and_link
(task_id: str, link: wfcommons.common.file.FileLink) → List[wfcommons.common.file.File] Get the list of files for a task ID and link type.
Parameters: - task_id (str) – The task ID.
- link (FileLink) – Type of file link.
Returns: List of files for a task ID and link type.
Return type: List[File]
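Example (illustrative sketch). A lookup by task ID and link type can be modelled with a nested dictionary. SketchFileLink, SketchFile, and FileRegistry are hypothetical stand-ins defined for this example; how wfcommons actually stores generated files is not described here.

```python
from enum import Enum
from typing import Dict, List, NamedTuple


class SketchFileLink(Enum):
    """Local stand-in for FileLink; the member names are assumptions."""
    INPUT = "input"
    OUTPUT = "output"


class SketchFile(NamedTuple):
    name: str
    size: int  # bytes
    link: SketchFileLink


class FileRegistry:
    """Hypothetical store mapping task ID -> link type -> files."""

    def __init__(self) -> None:
        self._files: Dict[str, Dict[SketchFileLink, List[SketchFile]]] = {}

    def add(self, task_id: str, file: SketchFile) -> None:
        self._files.setdefault(task_id, {}).setdefault(file.link, []).append(file)

    def get_files_by_task_and_link(self, task_id: str,
                                   link: SketchFileLink) -> List[SketchFile]:
        # Unknown task IDs or link types yield an empty list.
        return list(self._files.get(task_id, {}).get(link, []))


registry = FileRegistry()
registry.add("task_1", SketchFile("in.dat", 100, SketchFileLink.INPUT))
registry.add("task_1", SketchFile("out.dat", 50, SketchFileLink.OUTPUT))
inputs = registry.get_files_by_task_and_link("task_1", SketchFileLink.INPUT)
```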
-
_workflow_recipe
() → Dict[str, Any] Recipe for generating synthetic traces for a workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe Instantiate a workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: WorkflowRecipe
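Example (illustrative sketch). from_num_tasks follows the alternate-constructor pattern: a classmethod on the abstract base that returns an instance of whichever concrete subclass it is called on. The classes below are hypothetical stand-ins with a reduced parameter list, intended only to show the pattern.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class SketchRecipe(ABC):
    """Hypothetical abstract recipe (not the wfcommons class)."""

    def __init__(self, name: str, num_tasks: int,
                 runtime_factor: float = 1.0) -> None:
        self.name = name
        self.num_tasks = num_tasks
        self.runtime_factor = runtime_factor

    @abstractmethod
    def build_workflow(self, workflow_name: Optional[str] = None) -> List[str]:
        ...

    @classmethod
    def from_num_tasks(cls, num_tasks: int,
                       runtime_factor: float = 1.0) -> "SketchRecipe":
        # cls is the concrete subclass, so the caller gets that subclass back.
        return cls(name=cls.__name__, num_tasks=num_tasks,
                   runtime_factor=runtime_factor)


class ToyRecipe(SketchRecipe):
    def build_workflow(self, workflow_name: Optional[str] = None) -> List[str]:
        # Produce exactly num_tasks placeholder task names.
        return [f"task_{i}" for i in range(self.num_tasks)]


recipe = ToyRecipe.from_num_tasks(4, runtime_factor=2.0)
```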
wfcommons.generator.workflow.blast_recipe
-
class
wfcommons.generator.workflow.blast_recipe.
BLASTRecipe
(num_subsample: Optional[int] = 2, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 5, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
A BLAST workflow recipe class for creating synthetic workflow traces.
Parameters: - num_subsample (int) – The number of subsamples into which the reference file will be split.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the BLAST workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of a BLAST workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_subsample
(num_subsample: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.blast_recipe.BLASTRecipe Instantiate a BLAST workflow recipe that will generate synthetic workflows using the defined number of subsamples.
Parameters: - num_subsample (int) – The number of subsamples into which the reference file will be split.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A BLAST workflow recipe object that will generate synthetic workflows using the defined number of subsamples.
Return type: BLASTRecipe
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.blast_recipe.BLASTRecipe Instantiate a BLAST workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 5).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A BLAST workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: BLASTRecipe
wfcommons.generator.workflow.bwa_recipe
-
class
wfcommons.generator.workflow.bwa_recipe.
BWARecipe
(num_subsample: Optional[int] = 2, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 5, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
A BWA workflow recipe class for creating synthetic workflow traces.
Parameters: - num_subsample (int) – The number of subsamples into which the reference file will be split.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the BWA workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of a BWA workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_subsample
(num_subsample: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.bwa_recipe.BWARecipe Instantiate a BWA workflow recipe that will generate synthetic workflows using the defined number of subsamples.
Parameters: - num_subsample (int) – The number of subsamples into which the reference file will be split.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A BWA workflow recipe object that will generate synthetic workflows using the defined number of subsamples.
Return type: BWARecipe
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.bwa_recipe.BWARecipe Instantiate a BWA workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 6).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A BWA workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: BWARecipe
wfcommons.generator.workflow.cycles_recipe
-
class
wfcommons.generator.workflow.cycles_recipe.
CyclesRecipe
(num_points: Optional[int] = 1, num_crops: Optional[int] = 1, num_params: Optional[int] = 4, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 7, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
A Cycles workflow recipe class for creating synthetic workflow traces.
Parameters: - num_points (int) – The number of points of the spatial grid cell.
- num_crops (int) – The number of crops being evaluated.
- num_params (int) – The number of parameter values from the simulation matrix.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the Cycles workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of a Cycles workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.cycles_recipe.CyclesRecipe Instantiate a Cycles workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 7).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A Cycles workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: CyclesRecipe
-
classmethod
from_points_and_crops
(num_points: int, num_crops: int, num_params: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.cycles_recipe.CyclesRecipe Instantiate a Cycles workflow recipe that will generate synthetic workflows using the defined number of points, crops, and params.
Parameters: - num_points (int) – The number of points of the spatial grid cell.
- num_crops (int) – The number of crops being evaluated.
- num_params (int) – The number of parameter values from the simulation matrix.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A Cycles workflow recipe object that will generate synthetic workflows using the defined number of points, crops, and params.
Return type: CyclesRecipe
wfcommons.generator.workflow.epigenomics_recipe
-
class
wfcommons.generator.workflow.epigenomics_recipe.
EpigenomicsRecipe
(num_sequence_files: Optional[int] = 1, num_lines: Optional[int] = 10, bin_size: Optional[int] = 10, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 9, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
An Epigenomics workflow recipe class for creating synthetic workflow traces.
Parameters: - num_sequence_files (int) – Number of FASTQ files processed by the workflow.
- num_lines (int) – Number of lines in each FASTQ file.
- bin_size (int) – The amount of DNA and protein sequence information to be processed by each computational task.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the Epigenomics workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: str = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of an Epigenomics workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.epigenomics_recipe.EpigenomicsRecipe Instantiate an Epigenomics workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 9).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: An Epigenomics workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: EpigenomicsRecipe
-
classmethod
from_sequences
(num_sequence_files: int, num_lines: int, bin_size: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.epigenomics_recipe.EpigenomicsRecipe Instantiate an Epigenomics workflow recipe that will generate synthetic workflows using the defined number of sequence files, lines, and bin size.
Parameters: - num_sequence_files (int) – Number of FASTQ files processed by the workflow.
- num_lines (int) – Number of lines in each FASTQ file.
- bin_size (int) – The amount of DNA and protein sequence information to be processed by each computational task.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: An Epigenomics workflow recipe object that will generate synthetic workflows using the defined number of sequence files, lines, and bin size.
Return type: EpigenomicsRecipe
wfcommons.generator.workflow.genome_recipe
-
class
wfcommons.generator.workflow.genome_recipe.
GenomeRecipe
(num_chromosomes: Optional[int] = 1, num_sequences: Optional[int] = 1, num_populations: Optional[int] = 1, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 5, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
A 1000Genome workflow recipe class for creating synthetic workflow traces.
Parameters: - num_chromosomes (int) – The number of chromosomes evaluated in the workflow execution.
- num_sequences (int) – The number of sequences per chromosome file.
- num_populations (int) – The number of populations being evaluated.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_get_populations_files_recipe
(index: int) → Dict[wfcommons.common.file.FileLink, Dict[str, int]] Get the recipe for generating a population file.
Parameters: index (int) – Index of the population in the list. Returns: Recipe for generating a population file. Return type: Dict[FileLink, Dict[str, int]]
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the 1000Genome workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: str = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of a 1000Genome workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_chromosomes
(num_chromosomes: int, num_sequences: int, num_populations: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.genome_recipe.GenomeRecipe Instantiate a 1000Genome workflow recipe that will generate synthetic workflows using the defined number of chromosomes, sequences, and populations.
Parameters: - num_chromosomes (int) – The number of chromosomes evaluated in the workflow execution.
- num_sequences (int) – The number of sequences per chromosome file.
- num_populations (int) – The number of populations being evaluated.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A 1000Genome workflow recipe object that will generate synthetic workflows using the defined number of chromosomes, sequences, and populations.
Return type: GenomeRecipe
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.genome_recipe.GenomeRecipe Instantiate a 1000Genome workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 5).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A 1000Genome workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: GenomeRecipe
wfcommons.generator.workflow.montage_recipe
-
class
wfcommons.generator.workflow.montage_recipe.
MontageDataset
Bases:
wfcommons.utils.NoValue
An enumeration of Montage datasets.
-
DSS
= 'dss'
-
TWOMASS
= '2mass'
-
-
class
wfcommons.generator.workflow.montage_recipe.
MontageRecipe
(dataset: Optional[wfcommons.generator.workflow.montage_recipe.MontageDataset] = <MontageDataset.DSS>, num_bands: Optional[int] = 1, degree: Optional[float] = 0.5, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 133, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
,wfcommons.generator.workflow.montage_recipe._MontagetaskRatios
A Montage workflow recipe class for creating synthetic workflow traces. In this workflow recipe, traces will follow different recipes for different MontageDataset values.
Parameters: - dataset (MontageDataset) – The dataset to use for the mosaic (e.g., 2mass, dss).
- num_bands (int) – The number of bands (e.g., red, blue, and green) used by the workflow.
- degree (float) – The size (in degrees) to be used for the width/height of the final mosaic.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the Montage workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: str = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of a Montage workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_degree
(dataset: wfcommons.generator.workflow.montage_recipe.MontageDataset, num_bands: int, degree: float, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.montage_recipe.MontageRecipe Instantiate a Montage workflow recipe that will generate synthetic workflows using the defined dataset, number of bands, and degree.
Parameters: - dataset (MontageDataset) – The dataset to use for the mosaic (e.g., 2mass, dss).
- num_bands (int) – The number of bands (e.g., red, blue, and green) used by the workflow (at least 1).
- degree (float) – The size (in degrees) to be used for the width/height of the final mosaic (at least 0.5).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A Montage workflow recipe object that will generate synthetic workflows using the defined dataset, number of bands, and degree.
Return type: MontageRecipe
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.montage_recipe.MontageRecipe Instantiate a Montage workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 133).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
Returns: A Montage workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: MontageRecipe
-
class
wfcommons.generator.workflow.montage_recipe.
_MontagetaskRatios
Bases:
object
An auxiliary class for generating Montage tasks.
-
_get_max_num_tasks
(task_name: str, degree: float, dataset: wfcommons.generator.workflow.montage_recipe.MontageDataset) → int Get the maximum number of tasks that can be generated for a defined task.
Parameters: - task_name (str) – The task name prefix.
- degree (float) – The size (in degrees) to be used for the width/height of the final mosaic.
- dataset (MontageDataset) – The dataset to use for the mosaic (e.g., 2mass, dss).
Returns: The maximum number of tasks that can be generated for a defined task.
Return type: int
-
_get_max_rate_increase
(task_name: str, dataset: wfcommons.generator.workflow.montage_recipe.MontageDataset) → int Get the maximum rate of increase for a task prefix by increasing the workflow degree.
Parameters: - task_name (str) – The task name prefix.
- dataset (MontageDataset) – The dataset to use for the mosaic (e.g., 2mass, dss).
Returns: The maximum rate of increase for a task prefix by increasing the workflow degree.
Return type: int
-
_get_num_tasks
(task_name: str, degree: float, dataset: wfcommons.generator.workflow.montage_recipe.MontageDataset) → int Get a random number of tasks to be generated for a task prefix and workflow degree.
Parameters: - task_name (str) – The task name prefix.
- degree (float) – The size (in degrees) to be used for the width/height of the final mosaic.
- dataset (MontageDataset) – The dataset to use for the mosaic (e.g., 2mass, dss).
Returns: A random number of tasks to be generated for a task prefix and workflow degree.
Return type: int
-
tasks_ratios
= {<MontageDataset.TWOMASS>: {'mProject': (68, 44, 21), 'mDiffFit': (414, 112, 52), 'mBackground': (68, 23, 4)}, <MontageDataset.DSS>: {'mProject': (4, 4, 4), 'mDiffFit': (120, 134, 118), 'mBackground': (4, 4, 4)}}
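Example (illustrative sketch). tasks_ratios is a two-level mapping: dataset first, then task name prefix, yielding a 3-tuple of integers. The snippet below re-declares the enumeration locally as a plain Enum (the documented base is wfcommons.utils.NoValue) and copies the values shown above; ratios_for is a hypothetical helper, and the meaning of the individual tuple positions is not stated here, so the example does not interpret them.

```python
from enum import Enum
from typing import Dict, Tuple


class MontageDataset(Enum):
    """Local re-declaration for this example only."""
    TWOMASS = "2mass"
    DSS = "dss"


# Values copied from the tasks_ratios attribute documented above.
tasks_ratios: Dict[MontageDataset, Dict[str, Tuple[int, int, int]]] = {
    MontageDataset.TWOMASS: {
        "mProject": (68, 44, 21),
        "mDiffFit": (414, 112, 52),
        "mBackground": (68, 23, 4),
    },
    MontageDataset.DSS: {
        "mProject": (4, 4, 4),
        "mDiffFit": (120, 134, 118),
        "mBackground": (4, 4, 4),
    },
}


def ratios_for(task_name: str, dataset: MontageDataset) -> Tuple[int, int, int]:
    # Hypothetical helper: look up the ratio tuple for one task prefix.
    return tasks_ratios[dataset][task_name]


dataset = MontageDataset("2mass")  # enum lookup by value
diff_fit = ratios_for("mDiffFit", dataset)
```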
-
wfcommons.generator.workflow.seismology_recipe
-
class
wfcommons.generator.workflow.seismology_recipe.
SeismologyRecipe
(num_pairs: Optional[int] = 2, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 3, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
A Seismology workflow recipe class for creating synthetic workflow traces.
Parameters: - num_pairs (int) – The number of pairs of signals used to estimate earthquake STFs.
- data_footprint (int) – The upper bound for the workflow total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which the sizes of task input files are increased or decreased.
- output_file_size_factor (float) – The factor by which the sizes of task output files are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the Seismology workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of a Seismology workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_pairs
(num_pairs: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.seismology_recipe.SeismologyRecipe Instantiate a Seismology workflow recipe that will generate synthetic workflows using the defined number of pairs.
Parameters: - num_pairs (int) – The number of pairs of signals used to estimate earthquake STFs (at least 2).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
Returns: A Seismology workflow recipe object that will generate synthetic workflows using the defined number of pairs.
Return type: SeismologyRecipe
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.seismology_recipe.SeismologyRecipe Instantiate a Seismology workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 3).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
Returns: A Seismology workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: SeismologyRecipe
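The two class methods above are usually combined with the WorkflowGenerator documented earlier. The following is a minimal sketch of that flow, not the library's own example: `load_seismology_classes` and `build_seismology_traces` are hypothetical helper names, the import paths are copied from the headings of this reference and may differ in other WfCommons releases, and the classes are passed in as arguments so the logic can be exercised without the package installed.

```python
from typing import Any, List, Tuple


def load_seismology_classes() -> Tuple[Any, Any]:
    """Import lazily so the sketch can be read without wfcommons installed."""
    # Assumption: module paths as listed in this reference; other releases
    # of the package may be laid out differently.
    from wfcommons.generator.generator import WorkflowGenerator
    from wfcommons.generator.workflow.seismology_recipe import SeismologyRecipe
    return SeismologyRecipe, WorkflowGenerator


def build_seismology_traces(recipe_cls: Any, generator_cls: Any,
                            num_pairs: int, num_workflows: int) -> List[Any]:
    """Hypothetical helper: create a recipe from a pair count, then generate traces."""
    if num_pairs < 2:
        # The reference documents "at least 2" for from_num_pairs.
        raise ValueError("num_pairs must be at least 2")
    # Only documented calls are used: from_num_pairs, the generator
    # constructor, and build_workflows.
    recipe = recipe_cls.from_num_pairs(num_pairs, runtime_factor=1.0)
    generator = generator_cls(recipe)
    return generator.build_workflows(num_workflows)
```

With WfCommons installed, `build_seismology_traces(*load_seismology_classes(), num_pairs=4, num_workflows=3)` would be expected to return a list of three Workflow objects.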
wfcommons.generator.workflow.soykb_recipe¶
-
class
wfcommons.generator.workflow.soykb_recipe.
SoyKBRecipe
(num_fastq_files: Optional[int] = 2, num_chromosomes: Optional[int] = 1, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 14, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
A SoyKB workflow recipe class for creating synthetic workflow traces.
Parameters: - num_fastq_files (int) – The number of FASTQ files to be analyzed.
- num_chromosomes (int) – The number of chromosomes.
- data_footprint (int) – The upper bound for the workflow's total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the SoyKB workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of a SoyKB workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.soykb_recipe.SoyKBRecipe Instantiate a SoyKB workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 14).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
Returns: A SoyKB workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: SoyKBRecipe
-
classmethod
from_sequences
(num_fastq_files: int, num_chromosomes: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.soykb_recipe.SoyKBRecipe Instantiate a SoyKB workflow recipe that will generate synthetic workflows using the defined number of FASTQ files and chromosomes.
Parameters: - num_fastq_files (int) – The number of FASTQ files to be analyzed (at least 2).
- num_chromosomes (int) – The number of chromosomes (in the range [1, 22]).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
Returns: A SoyKB workflow recipe object that will generate synthetic workflows using the defined number of FASTQ files and chromosomes.
Return type: SoyKBRecipe
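The bounds documented for from_sequences (at least 2 FASTQ files, between 1 and 22 chromosomes) can be checked before a recipe is created. This is a small sketch under that reading of the parameter descriptions; `check_soykb_sequences` is a hypothetical helper, not part of WfCommons, and the library may perform its own validation with different messages.

```python
def check_soykb_sequences(num_fastq_files: int, num_chromosomes: int) -> None:
    """Hypothetical pre-check mirroring the documented from_sequences bounds."""
    if num_fastq_files < 2:
        # Documented as "at least 2".
        raise ValueError("num_fastq_files must be at least 2")
    if not 1 <= num_chromosomes <= 22:
        # Documented as the inclusive range [1, 22].
        raise ValueError("num_chromosomes must be in the range [1, 22]")


# Values inside the documented bounds pass silently.
check_soykb_sequences(num_fastq_files=4, num_chromosomes=2)
```

After the check passes, the same two values could be handed to `SoyKBRecipe.from_sequences(num_fastq_files, num_chromosomes)`.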
wfcommons.generator.workflow.srasearch_recipe¶
-
class
wfcommons.generator.workflow.srasearch_recipe.
SRASearchRecipe
(num_accession: Optional[int] = 2, data_footprint: Optional[int] = 0, num_tasks: Optional[int] = 3, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) Bases:
wfcommons.generator.workflow.abstract_recipe.WorkflowRecipe
An SRA Search workflow recipe class for creating synthetic workflow traces.
Parameters: - num_accession (int) – The number of NCBI accession numbers.
- data_footprint (int) – The upper bound for the workflow's total data footprint (in bytes).
- num_tasks (int) – The upper bound for the total number of tasks in the workflow.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
-
_abc_impl
= <_abc_data object>
-
_add_merge_task
(workflow, input_files, parents) → wfcommons.common.task.Task Create a merge task.
Parameters: - workflow (Workflow) – Workflow object instance.
- input_files (List[File]) – List of input files for the task.
- parents (List[Task]) – List of parent tasks.
Returns: A merge task object. Return type: Task
-
_workflow_recipe
() → Dict[KT, VT] Recipe for generating synthetic traces of the SRA Search workflow. Recipes can be generated by using the TraceAnalyzer.
Returns: A recipe in the form of a dictionary in which keys are task prefixes. Return type: Dict[str, Any]
-
build_workflow
(workflow_name: Optional[str] = None) → wfcommons.common.workflow.Workflow Generate a synthetic workflow trace of an SRA Search workflow.
Parameters: workflow_name (str) – The workflow name. Returns: A synthetic workflow trace object. Return type: Workflow
-
classmethod
from_num_accession
(num_accession: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.srasearch_recipe.SRASearchRecipe Instantiate an SRA Search workflow recipe that will generate synthetic workflows using the defined number of accession numbers.
Parameters: - num_accession (int) – The number of NCBI accession numbers.
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
Returns: An SRA Search workflow recipe object that will generate synthetic workflows using the defined number of accession numbers.
Return type: SRASearchRecipe
-
classmethod
from_num_tasks
(num_tasks: int, runtime_factor: Optional[float] = 1.0, input_file_size_factor: Optional[float] = 1.0, output_file_size_factor: Optional[float] = 1.0) → wfcommons.generator.workflow.srasearch_recipe.SRASearchRecipe Instantiate an SRA Search workflow recipe that will generate synthetic workflows up to the total number of tasks provided.
Parameters: - num_tasks (int) – The upper bound for the total number of tasks in the workflow (at least 6).
- runtime_factor (float) – The factor by which task runtimes are increased or decreased.
- input_file_size_factor (float) – The factor by which task input file sizes are increased or decreased.
- output_file_size_factor (float) – The factor by which task output file sizes are increased or decreased.
Returns: An SRA Search workflow recipe object that will generate synthetic workflows up to the total number of tasks provided.
Return type: SRASearchRecipe
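Because every recipe accepts a runtime_factor, one plausible use is to generate the same workflow shape at several runtime scales. The sketch below assumes that pattern; `sweep_runtime_factors` is a hypothetical helper and the trace names it builds are illustrative only. The recipe and generator classes are injected as arguments, so any recipe exposing the documented from_num_tasks signature could be used in place of SRASearchRecipe.

```python
from typing import Any, Dict, Iterable


def sweep_runtime_factors(recipe_cls: Any, generator_cls: Any,
                          num_tasks: int,
                          factors: Iterable[float]) -> Dict[float, Any]:
    """Hypothetical helper: build one synthetic trace per runtime factor."""
    if num_tasks < 6:
        # from_num_tasks is documented with "at least 6" for SRA Search.
        raise ValueError("num_tasks must be at least 6")
    traces: Dict[float, Any] = {}
    for factor in factors:
        # Documented calls only: from_num_tasks on the recipe and
        # build_workflow(workflow_name) on the generator.
        recipe = recipe_cls.from_num_tasks(num_tasks, runtime_factor=factor)
        generator = generator_cls(recipe)
        traces[factor] = generator.build_workflow(f"srasearch-rf-{factor}")
    return traces
```

With WfCommons installed, passing `SRASearchRecipe` and `WorkflowGenerator` together with factors such as `[0.5, 1.0, 2.0]` would be expected to yield three Workflow objects keyed by factor.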