
Hi FunnyTurkey96
Any chance you can try to run with the latest from GitHub? (I just tested your code and it seemed to work on my machine.)
pip install git+
Hi @<1730396272990359552:profile|CluelessMouse37>
However, the caching doesn't seem to be working correctly. Despite not changing the configuration, the first step runs every time.
How are you creating the cached component?
is this a standalone script or a git repo link?
These parameters are dictionaries of specific configurations (dict of dict) that are the same but might not be taken into account properly by the caching mechanism.
hmm for the component to be cached (or reuse...
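One way dict-of-dict parameters can defeat a hash-based cache: if the cache key is built from a non-canonical serialization of the arguments, two equal dicts can still produce different keys. A minimal pure-Python sketch of the idea (not ClearML's actual implementation):

```python
import hashlib
import json

def cache_key(params: dict) -> str:
    """Build a cache key from a params dict.

    sort_keys=True canonicalizes the serialization, so dicts that
    compare equal always hash to the same key, regardless of the
    order their keys were inserted in.
    """
    blob = json.dumps(params, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

# Two equal nested configs built in different insertion order
a = {"model": {"lr": 0.1, "layers": 4}, "data": {"split": 0.2}}
b = {"data": {"split": 0.2}, "model": {"layers": 4, "lr": 0.1}}

assert a == b
assert cache_key(a) == cache_key(b)   # canonical key -> cache hit

# Without sort_keys, the serialized strings (and any naive key) differ
assert json.dumps(a) != json.dumps(b)
```

If the caching layer hashes the raw serialization instead of a canonical one, identical configurations can look different and trigger a re-run.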
Hmm, so what I'm thinking is "extending" the capabilities of the "configuration" section (as it seems this is the right context). Allowing you to upload a bunch of files (with the same mechanism as artifacts) as zip files; in the configuration "editable" section, have the URL storing the zip together with the target folder. wdyt?
JitteryCoyote63 The release was delayed due to a last-minute issue, should be released later today. Anyhow the code is updated on GitHub, so you can start implementing :) let me know if I can be of help :)
Is there a way to force clearml not to upload these models?
DistressedGoat23 is it uploading models or registering them? to disable both set auto_connect_frameworks https://clear.ml/docs/latest/docs/clearml_sdk/task_sdk#automatic-logging
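For reference, a minimal sketch of disabling framework auto-logging via the documented `auto_connect_frameworks` argument to `Task.init` (the project/task names here are hypothetical):

```python
def init_task_without_model_autolog():
    """Start a ClearML task with automatic model logging disabled.

    Passing auto_connect_frameworks=False disables all framework
    auto-logging; passing a dict disables selected frameworks only.
    """
    from clearml import Task  # imported lazily so the sketch stays importable

    return Task.init(
        project_name="examples",       # hypothetical project/task names
        task_name="no-auto-models",
        auto_connect_frameworks={"pytorch": False},  # stop PyTorch model upload/registration
    )
```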
Their names only contain the task name and some unique id, so how can I know to which exact training
You mean the models or the experiments being created ?
(with older clearml versions though…).
Yes, we added a content-type header for the files when uploading to S3 (so it is easier for users to serve them back). But it seems the Python 3.5 casting from Path to str breaks the mimetype call...
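The failure mode is easy to sidestep by casting explicitly before guessing; a small illustration with the standard library (not the exact upload code, and the file name is hypothetical):

```python
import mimetypes
from pathlib import Path

artifact = Path("weights") / "model.png"   # hypothetical artifact path

# Casting the Path to str before guessing avoids the old Python 3.5
# issue where mimetypes choked on Path objects.
content_type, _ = mimetypes.guess_type(str(artifact))
print(content_type)  # image/png
```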
@<1539780258050347008:profile|CheerfulKoala77> make sure the AMI id matches the zone of the EC2 machine
I got everything working using the default queue. I can submit an experiment, and a new GPU node is provisioned, all good
Nice!
My next question, how do I add more queues?
You can create new queues in the UI and spin a new glue for the queue (basically think of a queue as an abstraction for a specific type of resource)
Make sense?
none of my pipeline tasks are reporting these graphs, regardless of runtime. I guess this line would also fix that?
Same issue, that said, good point, maybe with pipeline we should somehow make that a default ?
LovelyHamster1 NICE!
@<1535793988726951936:profile|YummyElephant76> oh you mean like jupyter server was running, then inside the notebook you would start a new venv, in that venv "notebook" package was missing, hence it failed detecting the notebook ?
JitteryCoyote63 Is this an Ignite feature ? what is the expectation ? (I guess the ClearML Logger just inherits from the base ignite logger)
Can you see it on the console ?
BTW: how is it missing torch in the listing? Do you have "import torch" in the code?
So like a UI for creating pipelines doing different things on the different solutions ?
So this is very odd, it looks like a pip bug:
The agent is trying to install torch==2.1.0.*
because by default it ignores the 4th+ parts (they are unstable and torch has a tendency to remove them), and for some reason pip will not match 2.1.0.*
with, for example, "2.1.0.dev20230306+cu118"
but based on the docs it should work:
see here: None
As a workaround you can always edit and change to the final url for example: so ...
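The matching behavior can be checked outside pip with the `packaging` library (assuming it is installed). Note that PEP 440 treats `.dev` versions as pre-releases, which prefix matching excludes by default, so this may be part of the explanation:

```python
from packaging.specifiers import SpecifierSet

spec = SpecifierSet("==2.1.0.*")
nightly = "2.1.0.dev20230306+cu118"

# Excluded by default: .dev versions are pre-releases
assert not spec.contains(nightly)

# Included once pre-releases are allowed (what `pip install --pre` enables)
assert spec.contains(nightly, prereleases=True)
```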
GrittyHawk31 by default any user can login (i.e. no need for password), if you want user/password access:
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_config/#web-login-authentication
Notice there is no need to have anything else in apiserver.conf, just the user/pass section; everything else will just be the default values.
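For reference, a sketch of what that user/pass section in apiserver.conf can look like (the credentials here are placeholders; check the linked docs for the exact schema):

```
auth {
    # Fixed users mode: only the listed users can log in
    fixed_users {
        enabled: true
        users: [
            {
                username: "jane"        # placeholder credentials
                password: "12345678"
                name: "Jane Doe"
            }
        ]
    }
}
```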
BTW: Can you also please test with the latest clearml version, 1.7.2
So was the issue solved?
See if this helps
WARNING:root:Could not lock cache folder /home/ronslos/.clearml/venvs-cache: [Errno 11] Resource temporarily unavailable
Hi @<1549927125220331520:profile|ZealousHare78>
could it be you are also working on the same machine ? are you running the agent in docker mode or venv mode ?
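That Errno 11 usually means another process already holds an exclusive lock on the folder, e.g. a second agent on the same machine. A minimal demonstration of the underlying flock behavior (Linux/macOS only; this is not the agent's actual code):

```python
import fcntl
import tempfile

# Two independent opens of the same file get independent flock locks,
# so the second non-blocking request fails with EAGAIN (Errno 11,
# "Resource temporarily unavailable").
path = tempfile.NamedTemporaryFile(delete=False).name
first = open(path, "w")
fcntl.flock(first, fcntl.LOCK_EX | fcntl.LOCK_NB)

second = open(path, "w")
try:
    fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
    contended = False
except BlockingIOError:   # maps to Errno 11 / EAGAIN
    contended = True

print(contended)  # True
```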
Hi @<1546665638829756416:profile|LovelyChimpanzee39>
anyone know what params I need to pass in order to enable it?
we feel you there, this is an actual plotly feature that we really want to add, but it's kind of out of our hands: None
feel free to add your support there
Hey IntriguedRat44 ,
Is this what you are after?
https://github.com/allegroai/trains/issues/181
Let me see if I can reproduce something
Hi @<1720249421582569472:profile|NonchalantSeaanemone34>
Is it possible to read data directly from server w/o using get_local_copy()?
do you mean an artifact ? what is direct here?
because comparing experiments using graphs is very useful. I think it is a nice to have feature.
So currently when you compare the graphs you can select the specific scalars to compare, and it updates in real time!
You can also bookmark the actual URL and it is fully reproducible (i.e. full state is stored)
You can also add custom columns to the experiment table (with the metrics) and sort / filter based on them, and create a summary dashboard (again, like all pages in the web app, the URL is...
However, once I extract the zips (or download the dataset through Python API or CLI) not all the files are there.
And all the files are registered in the metadata? Could you add --verbose
to the sync command to see what it is doing?
"clearml-data add --folder ./*" seems to fix this issue though it doesn't preserve my directory structure
This is also odd, it should Not flatten the folder structure. What is your OS / Python / clearml version?
Is this reproducible ? if so, how ...
single task in the DAG is an entire ClearML pipeline.
just making sure details are not lost, "entire ClearML pipeline": the pipeline logic is process A running on machine AA.
Every step of that pipeline can be (1) a subprocess, but that means the exact same environment is used for everything, or (2) the DEFAULT behavior: each step B is running on a different machine BB.
The non-ClearML steps would orchestrate putting messages into a queue, doing retry logic, and tr...