Are you running a self-hosted/enterprise server or on app.clear.ml? Can you confirm that the field in the screenshot is empty for you?
Or are you using the SDK to create an autoscaler script?
If you didn't use git, then ClearML saves your .py script completely in the "uncommitted changes" section, like you say. You should be able to just copy-paste it to get the code. In what format are your uncommitted changes logged? Can you post a screenshot or paste the contents of the uncommitted changes section?
No worries! Just so I understand fully though: you were already using the patch with success from my branch. Now that it has been merged into the transformers main branch you installed it from there, and that's when you started having issues with models not saving? Then installing transformers 4.21.3 fixes it (which should have the old clearml integration, even before the patch?)
You can apply git diffs by copying the diff to a file and then running `git apply <file_containing_diff>`.
But check this thread and make sure to dry-run first, to see what the patch will do before you overwrite anything:
https://stackoverflow.com/questions/2249852/how-to-apply-a-patch-generated-with-git-format-patch
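For example, here's the whole flow in a throwaway demo repo (file names are just for the demo; the key bit is `git apply --check` as the dry-run):

```shell
set -e
# Throwaway demo repo, so this is safe to run anywhere
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
printf 'hello\n' > greeting.txt
git add greeting.txt
git -c user.email=demo@example.com -c user.name=demo commit -qm init

# Make a change, save it as a diff, then revert the working tree
printf 'hello world\n' > greeting.txt
git diff > my_changes.diff
git checkout -q -- greeting.txt

# Dry-run first: --check exits non-zero if the patch would not apply cleanly
git apply --check my_changes.diff
git apply --stat my_changes.diff   # summary of what would change
git apply my_changes.diff          # apply for real
cat greeting.txt                   # -> hello world
```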
That wasn't my intention! Not a dumb question, just a logical one 😄
As long as your clearml-agents have access to the redis instance it should work! Cool use case though, interested to see how well it works 🙂
That's what happens in the background when you click "new run". A pipeline is simply a task in the background. You can find the task using querying, and you can clone it too! It is placed in a "hidden" folder called `.pipelines`, as a subfolder of your main project. Check out the settings; you can enable "show hidden folders".
Also, please note that since the video has been uploaded, the dataset UI has changed. So now a dataset will be found under the dataset tab on the left instead of in the experiment manager 🙂
Thank you so much ExasperatedCrocodile76 , I'll check it tomorrow 🙂
I tried answering them as well; let us know what you end up choosing, we're always looking to make ClearML better for everyone!
I see. Are you able to manually boot a VM on GCP, then manually SSH into it and run the docker login command from there? Just to rule out networking or permissions as possible issues.
Maybe you can add https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelinecontroller/#set_default_execution_queue to your pipeline controller, but have the actual value linked to a pipeline parameter? So when you create a new run, you can manually enter a queue name, and the pipeline controller script will use that parameter to set the default execution queue.
The point of the alias is better visibility in the experiment manager. Check the screenshots above for what it looks like in the UI. Essentially, setting an alias makes sure the task that gets the dataset automatically logs the ID it retrieves using `Dataset.get()`. The reason being that if you later look back at your experiment, you can also see which dataset was retrieved back then.
ExuberantBat52 When you still get the log messages, where did you specify the alias?...
Not exactly sure what is going wrong without an exact error or reproducible example.
However, passing around the dataset object is not ideal, because passing info from one step to another in a pipeline requires ClearML to pickle said object and I'm not exactly sure a Dataset obj is picklable.
Next to that, running get_local_copy() in the first step does not guarantee that you can access that data from the other step. Both might be executed in different docker containers or even on different...
If that's true, the error should be on the combine function, no? Do you have a more detailed error log or minimal reproducible example?
Hey @<1539780305588588544:profile|ConvolutedLeopard95> , unfortunately this is not built into the YOLOv8 tracker. Would you mind opening an issue on the YOLOv8 GitHub page and tagging me? (I'm thepycoder on GitHub)
I can then follow up the progress on it, because it makes sense to expose this parameter through the yaml.
That said, to help you right now, please change [this line](https://github.com/ultralytics/ultralytics/blob/fe61018975182f4d7645681b4ecc09266939dbfb/ultralytics/yolo/uti...
Yeah, I do the same thing all the time. You can limit the amount of tasks that are kept in HPO with the `save_top_k_tasks_only` parameter, and you can create subprojects by simply using a slash in the name 🙂 https://clear.ml/docs/latest/docs/fundamentals/projects#creating-subprojects
Hi ThoughtfulGrasshopper59 !
You're right, we should probably add the convenient `allow_archived` argument to `.get_tasks()` as well.
That said, for now this can be a workaround:
```
from clearml import Task

print([task.name for task in Task.get_tasks(
    project_name="TAO Toolkit ClearML Demo",
    task_filter=dict(system_tags=['archived'])
)])
```
Specifically `task_filter=dict(system_tags=['archived'])` should be what you need.
Oohh interesting! Thanks for the minimal example though. We might want to add it to the docs as an example of dynamic DAG creation 🙂
Great to hear! Then it comes down to waiting for the next Hugging Face release!
Ah I see 😄 I have submitted a ClearML patch to Huggingface transformers: None
It is merged, but not in a release yet. Would you mind checking if it works if you install transformers from github? (aka the latest master version)
Yes you can! The filter syntax can be quite confusing, but for me it helps to print `task.__dict__` on an existing task object to see what options are available. You can get values in a nested dict by appending them into a string with a `.`
Example code:
```
from clearml import Task

task = Task.get_task(task_id="17cbcce8976c467d995ab65a6f852c7e")
print(task.__dict__)

list_of_tasks = Task.query_tasks(task_filter={
    "all": dict(fields=['hyperparams.General.epochs.value'], p...
```
Can you try setting the env variables to `1` instead of `True`? In general, those should indeed be the correct variables to set. For me it works when I start the agent with the following command:
```
CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1 CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1 clearml-agent daemon --queue "demo-queue"
```
Hi EmbarrassedSpider34 , would you mind showing us a screenshot of your machine configuration? Can you check for any output logs that ClearML might have given you? Depending on the region, maybe there were no GPUs available, so could you maybe also check if you can manually spin up a GPU vm?
Isitdown seems to be reporting it as up. Any issues with other websites?
RoundMosquito25 it is true that the TaskScheduler requires a `task_id`, but that does not mean you have to run the pipeline every time 🙂
When setting up, you indeed need to run the pipeline once, to get it into the system. But from that point on, you should be able to just use the task_scheduler on the pipeline ID. The scheduler should automatically clone the pipeline and enqueue it. It will basically use the 1 existing pipeline as a "template" for subsequent runs.
Hi ExasperatedCrocodile76 ,
You can try running the agent with these environment variables set to 1:
```
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1 CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
```
There's more env vars here: https://clear.ml/docs/latest/docs/clearml_agent/clearml_agent_env_var
Does that work for you?
The built-in HPO uses tags to group experiment runs together, and actually uses the original optimizer task ID as a tag to be able to quickly go back and see where they came from. You can find an example in the ClearML Examples project.
To be honest, I'm not completely sure as I've never tried hundreds of endpoints myself. In theory, yes it should be possible, Triton, FastAPI and Intel OneAPI (ClearML building blocks) all claim they can handle that kind of load, but again, I've not tested it myself.
To answer the second question, yes! You can basically use the "type" of model to decide where it should be run. You always have the custom model option if you want to run it yourself too 🙂