Reputation
Badges 1
25 × Eureka!Thanks JitteryCoyote63 !
Any chance you want to open github issue with the exact details or fix with a PR ?
(I just want to make sure we fix it as soon as we can π )
I'm sorry JitteryCoyote63 No π
I do know that the enterprise addition have these features (a.k.a vault & permissions), basically to answer these types of situations.
Basically if I pass an arg with a default value of False, which is a bool, it'll run fine originally, since it just accepted the default value.
I think this is the nargs="?"
, is that right ?
Yep, and this is the root cause of the issue (But easily fixable) π
ValueError('Task object can only be updated if created or in_progress')
It seems the task
is not "running" hence the error, could that be
how can I for example convert it back to a pandas dataframe?
You can always report csv file with report_media as well, or if this is not for debugging maybe an artifact ?
JitteryCoyote63 Hmmm in theory, yes.
In practice you need to change this line:
https://github.com/allegroai/clearml/blob/fbbae0b8bc933fbbb9811faeabb9b6d9a0ea8d97/clearml/automation/aws_auto_scaler.py#L78
` python -m clearml_agent --config-file '/root/clearml.conf' daemon --queue '{queue}' {docker} --gpus 0 --detached
python -m clearml_agent --config-file '/root/clearml.conf' daemon --queue '{queue}' {docker} --gpus 1 --detached
python -m clearml_agent --config-file '/root/clearml.conf' d...
JitteryCoyote63 I think that with 0.17.2 we stopped mounting the venv build to the host machine. Which means it is all stored inside the docker.
instead of terminating them once they are inactive, so that they could be available immediately when they are needed.
JitteryCoyote63 I think you can increase the IDLE timeout on the autoscaler, and achive the same behavior, no ?
SmarmySeaurchin8 yes, the package containing the Controller is only RC, plan is to release the stable one in a couple of days. In the meantime:pip install git+
Interesting... TrickyRaccoon92 could it be the validation phase was creating a new Tensorboard file ?
Hi LazyTurkey38
, is it possible to have the agents keep a local version and only download the diff of the job commit to speed things up?
This is what it does, it has a local cached copy and it only pulls the latest changes
@<1577468638728818688:profile|DelightfulArcticwolf22>
How can I tell clearml-agent not to run pip install unless my requierments.txt file was changed.
the agent has built in cache, it will reuse the previous venv if nothing changed (cache local on the agent's machine).
Make sure this is line is not commented :
None
If you have idea on where to start looking for a quick win, I'm open to suggestions π
@<1587253076522176512:profile|HollowPeacock33>
Is this a commercial ad? this seems like out of scope for this channel
Can you expand?
worker nodes are bare metal and they are not in k8s yet
By default the agent will use 10022 as an initial starting port for running the sshd that will be mapped into the container. This has nothing to do with the Host machine's sshd. (I'm assuming agent running in docker mode)
Number of entries in the dataset cache can be controlled via cleaml.conf : sdk.storage.cache.default_cache_manager_size
SmarmySeaurchin8
Something like this one:vector_series = np.random.randint(10, size=10).reshape(2,5) logger.report_vector(title='vector example', series='vector series', values=vector_series, iteration=0, labels=['A','B'], xaxis='X axis label', yaxis='Y axis label')
Hi @<1557899668485050368:profile|FantasticSquid9>
There is some backwards compatibility issue with 1.2 (I think).
Basically what you need it to spin a new one on a new session ID and rergister the endpoints
(basically python abusing types/casting where the value can be both str/bool on the same argparser aergument)
Hi GrittyKangaroo27
Maybe check the TriggerScheduler , and have a function trigger something on k8s every time you "publish" a model?
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py
But adding a simpleΒ
force_download
Β flag to theΒ
get_local_copy
That's sounds like a good idea
Hi @<1547028074090991616:profile|ShaggySwan64>
I'm guessing just copying the data folder with rsync is not the most robust way to do that since there can be writes into mongodb etc.
Yep
Does anyone have experience with something like that?
basically you should just backup the 3 DBs (mongo, redis, elastic) each one based on their own backup workflows. Then just rsync the files server & configuration.
GrievingTurkey78 where do you see this message? Can you send the full server log
?
are you referring to the same line? 47 in cache.py?