Yes, you need to call the function every time. The remote run might have some parameters populated which you can use, but the pipeline function needs to be called if you actually want to run the pipeline.
If the task is running remotely and the parameters are populated, then the local run parameters will not be used; instead, the parameters that are already on the task will be used. This is because we want to allow users to change these parameters in the UI if they want to, so the parameters in the code are ignored in favor of the ones in the UI.
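A minimal sketch of what that looks like with a decorator-based pipeline (the pipeline name, project and parameter are placeholders, not your actual code):
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.pipeline(name="demo pipeline", project="examples", version="1.0")
def my_pipeline(learning_rate=0.001):
    ...

if __name__ == "__main__":
    # the decorated function must be called for the pipeline to run;
    # on a remote run, the parameter values edited in the UI replace the defaults passed here
    my_pipeline(learning_rate=0.01)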
Hi OutrageousSheep60! The list_datasets function is currently broken and will be fixed in the next release.
Hi DrabOwl94 Looks like this is a bug. Strange no one found it until now. Anyway, you can just add --params-override at the end of the command line and it should work (along with --max-iteration-per-job <YOUR_INT> and --total-max-job <YOUR_INT>, as Optuna requires these). We will fix this one in the next patch.
Also, could you please open a Github issue? It should contain your command line and this error.
Thank you
Hi @<1533620191232004096:profile|NuttyLobster9> ! PipelineDecorator.get_current_pipeline will return a PipelineDecorator instance (which inherits from PipelineController) once the pipeline function has been called. So
pipeline = PipelineDecorator.get_current_pipeline()
pipeline(*args)
doesn't really make sense. You should likely call pipeline = build_pipeline(*args) instead.
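Roughly what I mean, as a sketch (build_pipeline and the step are placeholders):
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["doubled"])
def double(x):
    return x * 2

@PipelineDecorator.pipeline(name="demo", project="examples", version="1.0")
def build_pipeline(x=1):
    # inside the pipeline body, get_current_pipeline() returns the running controller
    controller = PipelineDecorator.get_current_pipeline()
    print(controller.id)
    return double(x)

if __name__ == "__main__":
    PipelineDecorator.run_locally()
    build_pipeline(x=3)  # call the decorated function itself, not get_current_pipeline()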
Hi @<1523702000586330112:profile|FierceHamster54> ! This is currently not possible, but I have a workaround in mind. You could use the artifact_serialization_function parameter in your pipeline. The function should return a bytes stream of the zipped content of your data with whichever compression level you have in mind.
If I'm not mistaken, you wouldn't even need to write a deserialization function in your case, because we should be able to unzip your data just fine.
Wdyt?
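Something along these lines (gzip over pickle is just one option, and the names are placeholders):
import gzip
import pickle

from clearml.automation.controller import PipelineDecorator

def serialize_compressed(obj):
    # return a bytes stream of the compressed content; this is what gets stored as the artifact
    return gzip.compress(pickle.dumps(obj), compresslevel=9)

@PipelineDecorator.pipeline(
    name="demo",
    project="examples",
    version="1.0",
    artifact_serialization_function=serialize_compressed,
)
def my_pipeline():
    ...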
After you do s['Function']['random_number'] = random.random() you still need to call set_parameters_as_dict(s).
What ClearML SDK version are you using?
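For example, assuming s came from get_parameters_as_dict and the code runs inside a task:
import random
from clearml import Task

task = Task.current_task()
s = task.get_parameters_as_dict()
s['Function']['random_number'] = random.random()
# editing the dict in place is not enough; write it back to the task
task.set_parameters_as_dict(s)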
Hi @<1546303293918023680:profile|MiniatureRobin9> The PipelineController has a property called id, so just doing something like pipeline.id should be enough.
pruning old ancestors sounds like the right move for now.
Hmm, in that case you might need to write it. Doesn't hurt trying either way.
Hi @<1546303293918023680:profile|MiniatureRobin9> ! When it comes to pipelines from functions/other tasks, this is not really supported. You could, however, cut the execution short when your step is being run by evaluating the return values from other steps.
Note that you should be able to skip steps if you are using pipelines from decorators.
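A rough sketch with decorators (the step names and condition are placeholders):
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["should_continue"])
def check_data():
    return False

@PipelineDecorator.component(return_values=["result"])
def heavy_step():
    return 42

@PipelineDecorator.pipeline(name="conditional demo", project="examples", version="1.0")
def my_pipeline():
    should_continue = check_data()
    # evaluating the returned value in the pipeline body lets you skip the later step entirely
    if not should_continue:
        return None
    return heavy_step()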
Hi @<1668427963986612224:profile|GracefulCoral77> ! The error is a bit misleading. What it actually means is that you shouldn't attempt to modify a finalized ClearML dataset (I suppose that is what you are trying to achieve). Instead, you should create a new dataset that inherits from the finalized one and sync that dataset, or leave the dataset in an unfinalized state.
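Something like this (the project/dataset names and the folder are placeholders):
from clearml import Dataset

parent = Dataset.get(dataset_project="examples", dataset_name="my_dataset")
# create a child dataset that inherits from the finalized one instead of modifying it
child = Dataset.create(
    dataset_project="examples",
    dataset_name="my_dataset",
    parent_datasets=[parent.id],
)
child.sync_folder(local_path="data/")
child.upload()
child.finalize()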
Yes, passing custom objects between steps should be possible. The only condition is that the objects are pickleable. What exactly are you returning from init_experiment?
Your object is likely holding some file descriptor or something like that. The pipeline steps all run in separate processes (they can even run on different machines when running remotely). You need to make sure that the objects you are returning are pickleable so they can be passed between these processes. You can check that the logger you are passing around is indeed pickleable by calling pickle.dumps on it and then loading the result in another run.
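For example, with logger being the object you return from init_experiment:
import pickle

# dump in one run...
with open("logger.pkl", "wb") as f:
    pickle.dump(logger, f)

# ...and load in another, to confirm the object survives crossing process boundaries
with open("logger.pkl", "rb") as f:
    restored = pickle.load(f)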
The best practice would ...
I meant the code where you upload an artifact, sorry
Actually, I think you want blop now that you renamed the project (instead of custom pipeline logic).
Try examples/.pipelines/custom pipeline logic instead of pipeline_project/.pipelines/custom pipeline logic
Hi @<1523701949617147904:profile|PricklyRaven28> ! We released ClearML SDK 1.9.1 yesterday. Can you please try it?
FiercePenguin76 Are you changing the model by pressing the circled button in the first photo? Are you prompted with a menu like in the second photo?
Hi FreshParrot56! This is currently not supported.
That's unfortunate. Looks like this is indeed a problem. We will look into it and get back to you.
Hi @<1676038099831885824:profile|BlushingCrocodile88> ! We will soon try to merge a PR submitted via Github that will allow you to specify a list of files to be added to the dataset. You will then be able to do something like add_files(set(glob.glob("*")) - set(glob.glob("*.ipynb")))
This only affects single files; if you wish to add directories (with wildcards as well), you should be able to.
You could try this in the meantime if you don't mind temporary workarounds:dataset.add_external_files(source_url="
", wildcard=["file1.csv"], recursive=False)
@<1523721697604145152:profile|YummyWhale40> are you able to manually save models from SageMaker using OutputModel?
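Something like this, assuming the weights file from the SageMaker job is available locally (the framework and filename are placeholders):
from clearml import Task, OutputModel

task = Task.current_task()
output_model = OutputModel(task=task, framework="xgboost")
# manually register/upload the weights file produced by the SageMaker job
output_model.update_weights(weights_filename="model.tar.gz")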
QuaintJellyfish58 We will release an RC later today that adds the region to boto_kwargs. We will ping you when it's ready to try out.
Hi BoredBat47! What jsonschema version are you using?