The line before the last in your code snippet above: pipe.start_locally.
This sounds like a use case for the enterprise version of ClearML, where you can set read/write permissions. Publishing is considered a "write", so you can limit who can do it. Another thing that might be useful in your scenario is "Reports": connect the "approved" experiments' info to a report and then publish it. Here's a short video introducing reports.
By the way, please note that if the experiment/report/whatever is publis...
Hello @<1533257278776414208:profile|SuperiorCockroach75>, thanks for asking. It's actually unsupervised: modern LLMs are all trained to predict the next/missing words, which is an unsupervised method.
What happens if you comment out or remove the pipe.set_default_execution_queue('default') line and use run_locally instead of start_locally? Because in the current setup, you are basically asking to run the pipeline controller task locally, while the rest of the steps need to run on an agent machine. If you make the changes I suggested above, you will be able to run everything on your local machine.
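For reference, a minimal sketch of running everything locally, assuming you're on the decorator-based pipeline API (where run_locally exists; all names here are illustrative):
```python
from clearml import PipelineDecorator

@PipelineDecorator.component()
def step_one():
    return 42

@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="1.0")
def my_pipeline():
    print(step_one())

if __name__ == "__main__":
    # Run the controller AND all the steps in the local process,
    # instead of enqueuing the steps for an agent to pick up
    PipelineDecorator.run_locally()
    my_pipeline()
```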
To link a dataset to a task you need to pass the alias= parameter to Dataset.get. See here: https://clear.ml/docs/latest/docs/clearml_data/clearml_data_sdk#accessing-datasets
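A minimal sketch (project, name, and alias are illustrative):
```python
from clearml import Dataset

# Passing alias= links the dataset to the currently running task,
# so its id shows up in the task's configuration under that alias
dataset = Dataset.get(
    dataset_project="examples",
    dataset_name="my_dataset",
    alias="training_data",
)
local_path = dataset.get_local_copy()
```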
Hey @<1547390438648844288:profile|ScaryJellyfish75>, can you provide the whole code for the pipeline, and also mention which clearml version you are using?
Hey @<1526734437587357696:profile|ShaggySquirrel23>, what version of the clearml-agent are you using? Also, if I were you, I'd check how much free disk space there is on the machine running the agents.
Hey @<1554275802437128192:profile|CumbersomeBee33>, "aborted" usually means that someone manually stopped the pipeline or one of its experiments. Can you provide us with the code you used to run it?
Hey Pawel, thanks for opening the PR on Ultralytics' side. The full support should come from them, so if it's missing for YOLOv8 it means they didn't enable it. Still, you can try clearml-task for auto-logging support in the case of remote execution. Also, I'd say you could easily use a ClearML dataset id as input to YOLOv8 with a few lines of code, by downloading/Dataset.get-ting the dataset by id yourself and passing the path to it as input to the ultralytics...
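Something along these lines (a rough sketch; the dataset id is a placeholder and it assumes the dataset root contains a YOLO-format data.yaml):
```python
from clearml import Dataset
from ultralytics import YOLO

# Fetch a local copy of the ClearML dataset by its id
data_path = Dataset.get(dataset_id="<your_dataset_id>").get_local_copy()

model = YOLO("yolov8n.pt")
# Assumes a YOLO-style data.yaml sits at the dataset root
model.train(data=f"{data_path}/data.yaml", epochs=10)
```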
Hey @<1569858449813016576:profile|JumpyRaven4>, about your first point, what exactly is the question?
About your second point: you can try to manually save the final model and give it a proper file name; that way we will show it in the UI with the name you provided. Make sure to use xgboost.save_model and not raw pickle.
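For example (toy data, illustrative file name):
```python
import numpy as np
import xgboost as xgb

X, y = np.random.rand(100, 4), np.random.randint(0, 2, 100)
model = xgb.XGBClassifier(n_estimators=10).fit(X, y)

# Saving through xgboost's own API (not pickle) lets ClearML's
# integration pick the file up under the name you chose
model.save_model("my_model.json")
```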
For your final question, given that your models have customised code, I can suggest trying clearml.OutputModel, which will register the file you provide ...
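A minimal sketch of that manual registration (project, task, and model names are illustrative, and it assumes the weights file already exists on disk):
```python
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="manual model registration")

# Register an existing on-disk weights file as an output model of this task
output_model = OutputModel(task=task, name="my custom model")
output_model.update_weights(weights_filename="my_model.json")
```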
This sounds like you don't have clearml installed in the ubuntu container. Either that, or the clearml.conf in the container is not pointing to the server, as a result of which all information is missing.
I'd rather suggest you change the approach: run a clearml-agent set up with docker, and when you want to run YOLOv5 training, execute it remotely on the queue the agent is listening to.
I see you want to use the services queue for both the pipeline controller and the pipeline steps, but you have only one worker/agent listening to this queue. In this case you need at least two agents listening to the services queue. Try spawning an additional agent that listens to this queue and let me know how it goes.
This is the method you're looking for. But make sure you have a model saved on disk before using it. And if you don't want the model to be deleted from disk afterwards, make sure to set auto_delete_file=False.
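For illustration, assuming the method in question is OutputModel.update_weights (which accepts auto_delete_file; names below are placeholders):
```python
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="upload weights")
model = OutputModel(task=task)

model.update_weights(
    weights_filename="checkpoint.pt",  # must already exist on disk
    auto_delete_file=False,            # keep the local file after upload
)
```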
To my knowledge, no. You'd have to create your own front-end and use the model served with clearml-serving via an API
Hey @<1671689458606411776:profile|StormySeaturtle98>, we do support something called "Model Design" previews, basically an architecture description of the model, a la Caffe protobufs. For example, we store this info automatically with Keras.
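For other frameworks you can attach such a description manually; a small sketch (the design text itself is purely illustrative):
```python
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="model design preview")

# Attach an architecture description that shows up as the model's
# "design" preview in the UI (captured automatically with Keras)
design = "Conv2D(32, 3x3) -> ReLU -> MaxPool(2x2) -> Dense(10)"
model = OutputModel(task=task, config_text=design)
```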
Is this a Jupyter notebook or something? Can you download it properly as either a .ipynb or .py file?
Hey @<1577468626967990272:profile|PerplexedDolphin99>, yes, this method call will help you limit the number of files you have in your cache, but not the total size of your cache. To be able to control the size, I'd recommend checking the sdk.storage.cache section of the ~/clearml.conf file.
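As a rough sketch, that part of ~/clearml.conf looks something like the excerpt below; key names and defaults vary between SDK versions, so treat this as an assumption and check your own file against the configuration reference:
```
sdk {
    storage {
        cache {
            # where the local cache lives
            default_base_dir: "~/.clearml/cache"
            size {
                # start cleaning up when free disk space drops below this
                min_free_bytes: "10GB"
                # cap on total cache size; -1 means no cap
                max_used_bytes: "-1"
            }
        }
    }
}
```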
Wait, my config looks a bit different. What clearml package version are you using?
Hey @<1523701949617147904:profile|PricklyRaven28>, about the S3 loading issue. The path to the model in the artifacts tab, is it an S3 bucket or a local path?
Hey, yes, the reason for this issue seems to be our currently limited support for lightning 2.0. We will improve the support in the following releases. Right now, one way I can recommend to circumvent this issue is to use torch.save if possible, because we fully support automatic model capture on torch.save calls.
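For example (minimal sketch with a toy module standing in for your lightning model):
```python
import torch
from clearml import Task

task = Task.init(project_name="examples", task_name="lightning 2.0 workaround")

model = torch.nn.Linear(4, 2)  # stand-in for your actual module

# ClearML hooks torch.save, so this checkpoint is captured
# automatically and registered as an output model of the task
torch.save(model.state_dict(), "checkpoint.pt")
```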
Can you please check with the latest 1.10.2 SDK version whether the checkpointing issue still happens? As for the example code that couldn't be reproduced, we're already working on it and should have a fix in the next minor SDK version.
clearml-data also supports glob patterns, so if you have your dataset files in the same directory as the experiment code, you can do something like clearml-data add --files *.csv and only add the CSV files. There's no .gitignore-like functionality, because clearml-data is not meant to track everything; you need to be deliberate about what exactly you're adding. Hope this clarifies things.
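If you prefer the SDK over the CLI, the equivalent sketch would be something like this (project and dataset names are illustrative):
```python
from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="csv_only")
# SDK counterpart of `clearml-data add --files *.csv`:
# add only the CSV files from the current directory
ds.add_files(path=".", wildcard="*.csv")
ds.upload()
ds.finalize()
```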
This is doing fine-tuning. Training a multi-billion-parameter model from scratch would be economically unfeasible for most existing enterprises.
Can you please attach the code for the pipeline?
Hey @<1639799308809146368:profile|TritePigeon86>, given that you want to retry on connection error, wouldn't it be easier to use the retry_on_failure parameter of PipelineController / PipelineDecorator.pipeline?
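A minimal sketch of a retry callback (the retry policy shown is illustrative; you could inspect the failed node's task to detect connection errors specifically):
```python
from clearml import PipelineController

def retry_on_failure(pipeline, node, retries):
    # Illustrative policy: retry any failed step up to 3 times
    return retries < 3

pipe = PipelineController(
    name="example pipeline",
    project="examples",
    version="1.0",
    retry_on_failure=retry_on_failure,
)
```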
Ok, then launch an agent using clearml-agent daemon --queue default; that way your steps will be sent to the agent for execution. Note that in this case, you shouldn't change your code snippet in any way.
That is not specific enough. Can you show the code? And ideally also the console log of the pipeline.
Hey @<1523704757024198656:profile|MysteriousWalrus11>, given your use case, did you consider passing the path to the dataset? Like an address to an S3 bucket.