CostlyOstrich36 Yes, SDK 🙂
I see that this is not possible, but I also see that report_histogram is there (which reports through plotly),
and I was wondering whether there is any way to report a custom plotly figure when I have my own layout
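For the custom-layout question, one option may be `Logger.report_plotly`, which, as far as I know, accepts either a plotly Figure or a plain plotly dictionary. The following is only a sketch under that assumption: the helper names `build_custom_figure` and `report_custom_plot`, the titles, and the layout values are made up for illustration, and the logger is passed in so the snippet does not depend on a running task.

```python
def build_custom_figure(labels, values):
    """Build a plotly-style figure dictionary with a custom layout."""
    return {
        "data": [{"type": "bar", "x": list(labels), "y": list(values)}],
        "layout": {
            "title": {"text": "Custom layout"},
            "xaxis": {"title": {"text": "class"}},
            "yaxis": {"title": {"text": "count"}},
            "bargap": 0.4,
        },
    }


def report_custom_plot(logger, labels, values, iteration=0):
    """Send the figure through the logger's report_plotly method."""
    figure = build_custom_figure(labels, values)
    # Assumption: `logger` is a clearml Logger,
    # e.g. Task.current_task().get_logger().
    logger.report_plotly(
        title="Custom plot", series="counts", iteration=iteration, figure=figure
    )
    return figure
```

Because the figure is an ordinary dictionary, any layout key plotly understands can be set before reporting.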
but how to pass the file as an argument, I don't know
you can do it with docker_bash_setup_script, where you run conda install for whatever you need
But I want to use value from the arguments.
AgitatedDove14 Yeah, you are right: since a sub-component is not a task, the caching won't work. But it is the step result that's important, so if the step cache is available, I think it should cover the majority of pipeline use cases.
Yes it did work! Thank you!
Yes, that's what I found; otherwise clearml won't be able to see this function at execution time. I think it would be great to have such a possibility, because a step can be constructed from multiple sub-components, but not all of them need to be added to the UI graph. Some of them are just helper functions that make the code more readable.
AnxiousSeal95 Thank you so much! I will use it.
no, because I want to use pipe.add_parameter in the docker field of pipe.add_function_step, not in the function_kwargs
TimelyMouse69 the main problem is the arguments. Here is the code snippet:
` pipeline = PipelineController(name="Awesome Pipeline")
pipeline.add_parameter("docker_image", default="DEFAULT_DOCKER") `
And then I have a functional step where I want to use the argument:
` pipeline.add_function_step(name="best_step", docker="${pipeline.docker_image}" `
And also I tried
` parameters = pipeline.get_parameters()
pipeline.add_function_step(name="best_step", docker=parameters["docker_image"...
I'm running 1.7.0 (latest docker available).
Your example did work for me, but I'm gonna try the flush() method now
AnxiousSeal95 here
function_kwargs work, but the docker parameter does not
the intuition is: I care about the step result, and I also care what the sub-steps in the step are.
Example: the step "evaluate model" consists of dataset + model. I need the substeps:
download dataset, download models, evaluate.
I do not really care what will be in the substeps' metrics, but I care what is stored in the evaluate model step. It will make everything compact and easily accessible.
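To make the idea concrete, here is a rough sketch of what such sub-steps could look like in plain Python. This is not a ClearML feature: the `sub_node` decorator, the `SUB_NODE_LOG` list, and the placeholder dataset and model are all hypothetical. The decorator only records which sub-steps ran and how long they took, while the step itself returns a single result.

```python
import functools
import time

# Hypothetical registry of sub-step records; not part of ClearML.
SUB_NODE_LOG = []


def sub_node(func):
    """Record the name and duration of each call, return the result unchanged."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        SUB_NODE_LOG.append(
            {"name": func.__name__, "seconds": time.perf_counter() - start}
        )
        return result
    return wrapper


@sub_node
def download_dataset():
    return [1.0, 2.0, 3.0]  # placeholder data


@sub_node
def download_model():
    return lambda x: 2 * x  # placeholder model


@sub_node
def evaluate(model, dataset):
    return sum(model(x) for x in dataset) / len(dataset)


def evaluate_model_step():
    """The whole step: sub-steps run in one process, only the result is returned."""
    dataset = download_dataset()
    model = download_model()
    return evaluate(model, dataset)
```

After calling `evaluate_model_step()`, `SUB_NODE_LOG` holds one record per sub-step in execution order, which is roughly the information a UI would need to draw the sub-nodes.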
Hi CostlyOstrich36
Here is the code example which does not work for me
` def process_data(inputs):
    import pandas as pd
    from clearml import PipelineController
    _logger = PipelineController.get_logger()
    df = pd.DataFrame(inputs)
    _logger.report_table('Awesome', 'Details', table_plot=df)

pipeline = PipelineController(name='best_pipeline', project='test')
pipeline.add_function_step(name='process_data', function=process_data,
function_kw...
AgitatedDove14 maybe you have idea how to deal with the second issue? because this is exactly what I want to get 🙂
When I add a sleep to process_data, it works, provided there was enough time to upload the data
def process_data(inputs):
    import time
    import pandas as pd
    from clearml import PipelineController
    _logger = PipelineController.get_logger()
    df = pd.DataFrame(inputs)
    _logger.report_table('Awesome', 'Details', table_plot=df)
    time.sleep(10)
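Instead of a fixed sleep, the flush() method mentioned earlier can be called right after reporting. A minimal sketch, assuming `Logger.flush()` sends pending reports and that `report_table` accepts a list of rows as well as a DataFrame; here the logger is passed in as a parameter (in the pipeline step it would come from `PipelineController.get_logger()`), and `inputs` is assumed to be a dict of equal-length columns.

```python
def process_data(inputs, logger):
    """Report a table, then flush instead of sleeping."""
    # Build the table as a list of rows with the header first.
    # Assumption: report_table accepts a list of lists for table_plot.
    header = list(inputs.keys())
    rows = [list(row) for row in zip(*inputs.values())]
    table = [header] + rows
    logger.report_table("Awesome", "Details", iteration=0, table_plot=table)
    # Ask the logger to send pending reports before the step returns.
    logger.flush()
    return table
```

Whether flush() alone is enough may depend on the clearml version, so this is worth checking against the installed release.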
I agree, a lot of packages have to be installed before I can execute any command, but having something like "sub nodes" inside a pipeline, in my opinion, makes pipelines much more useful, in the sense that all the steps are visible. I haven't used pipelines before, and when I saw this UI I thought it would be very cool to highlight the execution steps.
CostlyOstrich36 maybe you have any idea why this code might not work for me?
From my experience with the pipeline so far and "sub-node" idea, I would say:
Keep the pipeline controller, with the possibility to define where to run the whole pipeline (same node/pod).
Every step can be pushed to be executed on a different pod.
Every step is a Task, but a step can consist of multiple functions which are "sub-nodes", and they must be executed on the same pod/node where the functional_step is defined.
As a result if the pipeline requires sharing large files select the pipeline to ru...
pipe.add_function_step(
    name='step_one',
    function=step_one,
    function_kwargs=dict(pickle_data_url='${pipeline.url}'),
    function_return=['data_frame'],
    cache_executed_step=True,
    # !!! this does not work:
    docker='${pipeline.url}',
) ``
I have another issue with pipelines. I have described it in another thread; would you mind if I tag you there? Because there is no solution yet 😞
Yes, but I'm not sure that they need to have a separate task. In my opinion, it would be better if they are visible in the UI but all the metrics/artifacts are reported to the step Task
AgitatedDove14 thanks for the link, but I need a different thing.
Step 1 of the pipeline: I download images from s3 (many of them) and want to return paths
Step 2 of the pipeline: read images from those paths
Here is the pseudocode
` def step_one():
    # download the dataset into the local cache
    download_dataset = StorageManager.get_local_copy()
    paths = collect_paths_as_strings()
    return paths

def step_two(paths):
    image_1 = read_image(paths[0]) `
Tried it, and in the output clearml-agent is trying to pull the image '${pipeline.docker_image}'; it cannot convert it to the value
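Since the placeholder reaches the agent as a literal string, one possible workaround is to resolve the image name in plain Python before calling add_function_step, so that `docker` receives an ordinary string. This is only a sketch of that idea, not a confirmed fix: the environment variable name `PIPELINE_DOCKER_IMAGE` and both helper functions are hypothetical, and the value can then no longer be changed as a pipeline parameter.

```python
import os

DEFAULT_DOCKER = "DEFAULT_DOCKER"  # placeholder default, as in the snippet


def resolve_docker_image(env=None, default=DEFAULT_DOCKER):
    """Return the image name from the environment, or the default."""
    env = os.environ if env is None else env
    image = env.get("PIPELINE_DOCKER_IMAGE", "").strip()
    return image or default


def is_unresolved_placeholder(value):
    """True if the value still looks like a '${...}' template."""
    return value.startswith("${") and value.endswith("}")


docker_image = resolve_docker_image()
# The resolved string would then be passed directly, for example:
# pipeline.add_function_step(name="best_step", docker=docker_image, ...)
```

The `is_unresolved_placeholder` check can be used to fail early when a template string is about to be handed to the `docker` argument.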
My reasoning is that pipelines can give me a good visual overview of what is going on, and I want to have a lot of small steps. My dataset is 2 GB of images, and I want to have a step where I download it with StorageManager.get_local_copy(), save it, and pass only the path to this dataset to the next steps. But every agent is a different pod, so I do not know how to properly share the folder with images.
Yes, I think it is absolutely fine. Here is the pseudocode of my understanding with ClearML syntax:
`
def complex_steps(args):
    # As far as I see, the functions should be implemented inside the step for ClearML to be able to see them
    @sub_node
    def action_1(params):
        ....
        return result

    @sub_node
    def action_2(params):
        ....
        return result

    @sub_node
    def action_3(params_1, params_2):
        ....
        return result

    act1_result = action_1(args.param1)
    ...
Hi AgitatedDove14 storage.
Step 1 of the pipeline – generate file
Step 2 of the pipeline – read file generated at step 1
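The two steps can be sketched in plain Python as follows, under the strong assumption that both steps see the same filesystem (for example a shared volume mounted into every pod). The directory argument, the file name `generated.json`, and its contents are made up for illustration; only the path string travels between the steps.

```python
import json
import tempfile
from pathlib import Path


def step_one(shared_dir):
    """Generate a file and return its path as a string."""
    out_path = Path(shared_dir) / "generated.json"
    out_path.write_text(json.dumps({"images": ["img_0.png", "img_1.png"]}))
    return str(out_path)


def step_two(path):
    """Read the file produced by step one."""
    return json.loads(Path(path).read_text())["images"]


if __name__ == "__main__":
    # A temporary directory stands in for the shared volume.
    with tempfile.TemporaryDirectory() as shared_dir:
        print(step_two(step_one(shared_dir)))
```

If the pods do not share storage, the path returned by step one is meaningless to step two, and the file itself would have to be uploaded somewhere both can reach.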