You can do it with docker_bash_setup_script, where you run conda install for what you need.
But how to pass the file as an argument, I don't know.
Pass this to the function step:
docker_args="--env CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1",
I have another issue with pipelines, which I described in another thread. Would you mind if I tag you there? There is no solution yet 😞
No, because I want to use pipe.add_parameter in the docker field of pipe.add_function_step, not in function_kwargs.
AnxiousSeal95 Thank you so much! I will use it.
I tried it, and according to the output clearml-agent is trying to pull the image '${pipeline.docker_image}'. It cannot convert the placeholder to its value.
pipe.add_function_step(
    name='step_one',
    function=step_one,
    function_kwargs=dict(pickle_data_url='${pipeline.url}'),
    function_return=['data_frame'],
    cache_executed_step=True,
    docker='${pipeline.url}',  # !!! this does not work
)
In function_kwargs the placeholders work, but in the docker parameter they do not.
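One possible workaround, sketched under assumptions: decide the image name in plain Python before the pipeline is built and pass the concrete string to docker, while still using the placeholder in function_kwargs where it does get resolved. The environment variable name PIPELINE_DOCKER_IMAGE, the project name and the version string are hypothetical. The trade-off is that the image is fixed when the pipeline is built, so it is not a real pipeline parameter.

```python
import os


def resolve_docker_image(default="DEFAULT_DOCKER", env_var="PIPELINE_DOCKER_IMAGE"):
    """Return the image name from the environment, or the default if unset or empty."""
    # PIPELINE_DOCKER_IMAGE is a made-up variable name used only for this sketch.
    return os.environ.get(env_var) or default


def step_one(pickle_data_url):
    # Placeholder body for illustration only.
    return pickle_data_url


def build_pipeline(docker_image):
    # Imported lazily so resolve_docker_image() can be used without clearml installed.
    from clearml import PipelineController

    pipe = PipelineController(name="Awesome Pipeline", project="test", version="0.0.1")
    pipe.add_parameter("url", default="")
    pipe.add_function_step(
        name="step_one",
        function=step_one,
        # Placeholders inside function_kwargs are resolved by the pipeline.
        function_kwargs=dict(pickle_data_url="${pipeline.url}"),
        function_return=["data_frame"],
        cache_executed_step=True,
        # A concrete string, chosen before the pipeline is built.
        docker=docker_image,
    )
    return pipe
```

Calling build_pipeline(resolve_docker_image()) would then hand the agent a real image name instead of the literal placeholder text.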
Yes it did work! Thank you!
Hi CostlyOstrich36
Here is a code example that does not work for me:
`
def process_data(inputs):
    import pandas as pd
    from clearml import PipelineController
    _logger = PipelineController.get_logger()
    df = pd.DataFrame(inputs)
    _logger.report_table('Awesome', 'Details', table_plot=df)

pipeline = PipelineController(name='best_pipeline', project='test')
pipeline.add_function_step(name='process_data', function=process_data,
    function_kw...
When I add a sleep to process_data, it works, provided the sleep leaves enough time to upload the data:
def process_data(inputs):
    import time
    import pandas as pd
    from clearml import PipelineController
    _logger = PipelineController.get_logger()
    df = pd.DataFrame(inputs)
    _logger.report_table('Awesome', 'Details', table_plot=df)
    time.sleep(10)
I'm running 1.7.0 (latest docker available).
Your example did work for me, but I'm going to try the flush() method now.
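A minimal sketch of the flush() idea, replacing the fixed sleep with an explicit flush right after the report. The helper name report_table_and_flush is hypothetical. Because a function step runs standalone, the sketch assumes the helper is handed to add_function_step through its helper_functions argument. Whether the flush alone is enough on a given SDK version is not confirmed here.

```python
def report_table_and_flush(logger, title, series, table):
    """Report a table, then ask the logger to send everything still pending."""
    logger.report_table(title, series, table_plot=table)
    logger.flush()


def process_data(inputs):
    # Imports stay inside the function because the step is executed standalone.
    import pandas as pd
    from clearml import PipelineController

    logger = PipelineController.get_logger()
    df = pd.DataFrame(inputs)
    # Assumes the step was registered with helper_functions=[report_table_and_flush].
    report_table_and_flush(logger, 'Awesome', 'Details', df)
```

Keeping the report and the flush in one helper makes it harder to add a report later and forget the flush.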
CostlyOstrich36 maybe you have any idea why this code might not work for me?
TimelyMouse69 the main problem is the arguments. Here is the code snippet:
pipeline = PipelineController(name="Awesome Pipeline")
pipeline.add_parameter("docker_image", default="DEFAULT_DOCKER")
Then I have a function step where I want to use the argument:
pipeline.add_function_step(name="best_step", docker="${pipeline.docker_image}"
I also tried:
` parameters = pipeline.get_parameters()
pipeline.add_function_step(name="best_step", docker=parameters["docker_image"...
I agree, a lot of packages have to be installed before I can execute any command, but having something like "sub-nodes" inside a pipeline would, in my opinion, make pipelines much more useful, in the sense that all the steps would be visible. I haven't used pipelines before, and when I saw this UI I thought it would be very cool to highlight the execution steps.
From my experience with pipelines so far and the "sub-node" idea, I would say:
Keep the pipeline controller with the possibility to define where to run the whole pipeline (same node/pod).
Every step can be pushed to be executed on a different pod.
Every step is a Task, but a step can consist of multiple functions which are "sub-nodes", and they must be executed on the same pod/node where the function step is defined.
As a result if the pipeline requires sharing large files select the pipeline to ru...
Yes, that's what I found; otherwise clearml won't be able to see this function at execution time. I think it would be great to have such a possibility, because a step can be constructed from multiple sub-components, but not all of them need to be added to the UI graph. Some of them are just helper functions that make the code more readable.
AgitatedDove14 Yeah, you are right: since a sub-component is not a Task, the caching won't work for it. But it is the step result that is important, so if the step cache is available, I think it should cover the majority of pipeline use cases.
Yes, but I'm not sure that they need to have a separate Task. In my opinion, it would be better if they were visible in the UI but all the metrics/artifacts were reported to the step Task.
My reasoning is that pipelines can give me a good visual overview of what is going on, and I want to have a lot of small steps. My dataset is 2 GB of images, and I want to have a step where I download it with StorageManager.get_local_copy(), save it, and pass only the path to this dataset to the next steps. But every agent is a different pod, so I do not know how to properly share the folder with the images.
AgitatedDove14 thanks for the link, but I need a different thing.
Step 1 of the pipeline: I download images from S3 (many of them) and want to return their paths.
Step 2 of the pipeline: read the images from those paths.
Here is the pseudocode:
`
def step_one():
    download_dataset = StorageManager.get_local_copy()
    paths = collect_pathes_as_strings()
    return paths

def step_two(paths):
    image_1 = read_image(paths[0])
`
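A local path returned by step one only exists on the pod that produced it, so one option is to pass remote URLs between steps and let each step fetch its own local copy. The sketch below assumes that approach. The bucket prefix and file names are hypothetical, and downloading again in every step costs time for a large dataset.

```python
def step_one(prefix, file_names):
    """Return remote URLs (plain strings) instead of paths on this pod's disk."""
    base = prefix.rstrip("/")
    return [f"{base}/{name}" for name in file_names]


def step_two(urls):
    """Fetch a local copy of each URL on whichever pod runs this step."""
    # Imported inside the function because the step is executed standalone.
    from clearml import StorageManager

    local_paths = [StorageManager.get_local_copy(remote_url=url) for url in urls]
    return local_paths
```

For example, step_one("s3://my-bucket/images/", ["a.png", "b.png"]) yields two URLs that step_two can resolve on a different pod.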
Yes, I think it is absolutely fine. Here is the pseudocode of my understanding with ClearML syntax:
`
def complex_steps(args):
    # As far as I can see, the functions should be implemented inside the step
    # for ClearML to be able to see them
    @sub_node
    def action_1(params):
        ....
        return result

    @sub_node
    def action_2(params):
        ....
        return result

    @sub_node
    def action_3(params_1, params_2):
        ....
        return result

    act1_result = action_1(args.param1)
    ...
AgitatedDove14 maybe you have an idea how to deal with the second issue? Because this is exactly what I want to get 🙂
Hi AgitatedDove14 storage.
Step 1 of the pipeline – generate a file
Step 2 of the pipeline – read the file generated at step 1
CostlyOstrich36 Yes, SDK 🙂
I see that that is not possible, but I also see that report_histogram is there (which reports through plotly), and I was wondering whether there is any way to report a custom plotly figure when I have my own layout.
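A hedged sketch of one way this might be done: the ClearML Logger has a report_plotly method that takes a plotly figure or a plotly-style dictionary, so a figure with its own layout can be built by hand and passed in. The helper names, the layout values and the title/series strings below are only illustrative, and it assumes the installed SDK version provides report_plotly.

```python
def build_custom_figure(labels, values):
    """Build a plotly-style figure dictionary with a hand-written layout."""
    return {
        "data": [{"type": "bar", "x": list(labels), "y": list(values)}],
        "layout": {
            "title": {"text": "Custom layout"},
            "xaxis": {"title": {"text": "label"}},
            "yaxis": {"title": {"text": "count"}},
            "bargap": 0.4,
        },
    }


def report_custom_figure(logger, figure, iteration=0):
    # Assumes logger is a clearml Logger whose SDK version provides report_plotly.
    logger.report_plotly(title="Awesome", series="Details", figure=figure, iteration=iteration)
```

Using a plain dictionary keeps the sketch free of a plotly import; a plotly.graph_objects.Figure could be passed in the same way.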
The intuition is: I care about the step result, and I also care about what the sub-steps in the step are.
Example: the step "evaluate model" consists of dataset + model. I need these sub-steps:
download dataset
download models
evaluate
I do not really care what will be in the sub-steps' metrics, but I care what is stored in the "evaluate model" step. It will make everything compact and easily accessible.
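To make the idea concrete, here is a plain-Python sketch of what a sub_node decorator could look like. It is not a ClearML feature: sub_node, SUB_NODE_LOG and the function bodies are all hypothetical. It only records which sub-steps ran and how long they took, while the result stays with the enclosing step.

```python
import functools
import time

# (sub-step name, seconds) for every sub-step call, in call order.
SUB_NODE_LOG = []


def sub_node(func):
    """Hypothetical decorator: record each call of a sub-step inside a step."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        SUB_NODE_LOG.append((func.__name__, time.perf_counter() - start))
        return result
    return wrapper


def evaluate_model(dataset_url, model_url):
    """One step whose result matters; the sub-steps are only tracked."""
    @sub_node
    def download_dataset(url):
        return f"dataset from {url}"  # placeholder body

    @sub_node
    def download_model(url):
        return f"model from {url}"  # placeholder body

    @sub_node
    def evaluate(dataset, model):
        return {"dataset": dataset, "model": model}  # placeholder result

    return evaluate(download_dataset(dataset_url), download_model(model_url))
```

After a call to evaluate_model, SUB_NODE_LOG lists the three sub-steps in execution order, which is the information a UI graph would need to draw them.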
But I want to use value from the arguments.