"what's the trains/trains-agent/trains-server versions ?" how can I check it?
trains/trains-agent are pip packages, so: pip freeze | grep trains
trains-server version you can check on the /profile page, top-left corner
Hi AttractiveWoodpecker16
I think this is the correct channel for that question.
(any chance you can move your thread there?)
Specifically, just email billing@clear.ml and they will cancel (no need to worry about the beginning of the month, just explain and they will not charge over Nov)
EDIT: I know they are working on making it a one-click in the UI; the main limitation is what happens with data that was stored above the free-tier threshold. Anyhow, I think the next version will sort that out as well.
But PyTorch has no specific logging backend of its own, it uses TB (TensorBoard).
No?! Can you point me to an example? What I mostly find is how to calc metrics, not a standard way to then store them...
Hi @<1603198134261911552:profile|ColossalReindeer77>
I would also check this one: None
The idea of queues is to strike a balance: not give the users too much freedom on the one hand, while on the other still allowing for maximum flexibility & control.
The granularity offered by K8s (and as you specified) is sometimes way too detailed for a user. For example: I know I want 4 GPUs, but 100GB of disk space? No idea, just give me 3 levels to choose from (if any; actually I would prefer a default that is large enough, since this is by definition for temp cache only). The same argument goes for the number of CPUs...
Ch...
Are you running it in venv mode or docker mode?
Also, how do I make the files other than the entry script visible to the job?
The assumption for ClearML (regardless of how you create a Task) is that your code is either a standalone script (or Jupyter notebook) or inside a git repository. In the case of a git repository, clearml-agent will clone the git repository of the code, apply the uncommitted changes, and run your code.
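For illustration, a minimal sketch of pointing a Task at a git repository so the agent clones it (which is also how all the other files in the repo become visible to the job); the repo URL, branch, and script path below are placeholders:
```
from clearml import Task

# sketch: create a Task whose code lives inside a git repository
# (repo URL, branch, and script path are placeholder assumptions)
task = Task.create(
    project_name='examples',
    task_name='remote run',
    repo='https://github.com/your-org/your-repo.git',
    branch='main',
    script='train.py',
)
# enqueue it; a clearml-agent will clone the repo and run the script
Task.enqueue(task, queue_name='default')
```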
GiddyTurkey39 can you ping the server-address
(just making sure, this should be the IP of the server not 'localhost')
Hi @<1600661428556009472:profile|HighCoyote66>
However, we need to allocate resources to ourselves manually, using an srun command or sbatch
Long story short: there is a full SLURM integration. Basically you push a job into the ClearML queue and it produces a SLURM job that uses the agent to set up the venv/container and run your Task, but this is only part of the enterprise version 😞
You can, however, do the following (notice this is ...
is this a config file on your side or something I can change, if we had enterprise version?
Yes, this is one of the things you can configure
Was going crazy for a short amount of time yelling to myself: I just installed clear-agent init!
oh noooooooooooooooooo
I can relate so much, it happens to me too often that copy-pasting into bash uses the unicode character instead of the regular ascii one
I'll let the front-end guys know, so we do not make ppl go crazy 😉
link to the line please 🙂
Hi GreasyPenguin14
Sure you can, although a bit convoluted (I'll make sure we have a nice interface 🙂 )
```
import hashlib
from clearml import Task

# title/series are md5-hashed because that is how the scalar keys
# are stored in the Task's last_metrics section
title = hashlib.md5('epoch_accuracy_title'.encode('utf-8')).hexdigest()
series = hashlib.md5('epoch_accuracy_series'.encode('utf-8')).hexdigest()
task_filter = {
    'page_size': 2,
    'page': 0,
    'order_by': ['last_metrics.{}.{}'.format(title, series)]
}
queried_tasks = Task.get_tasks(project_name='examples', task_filter=task_filter)
```
Hi TeenyFly97
Can I super-impose the graphs while comparing experiments?
Hmm not at the moment, I think someone asked for the option to control it, in both comparison mode and "standalone" mode.
There is a long discussion on this feature here:
https://github.com/allegroai/trains/issues/81#issuecomment-645425450
Feel free to chime in 🙂
I think that the latest agreement is a switch in the UI, separating or collecting (super-imposing) those graphs.
PompousBeetle71 just making sure, and changing the name solved it?
To auto upload the model you have to tell clearml to upload it somewhere, usually by passing output_uri to Task.init or setting the default_output_uri in the clearml.conf
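For example, a minimal sketch (the S3 bucket path is a placeholder; a local or shared path works as well):
```
from clearml import Task

# output_uri tells clearml where to upload models (and artifacts)
task = Task.init(
    project_name='examples',
    task_name='train with model upload',
    output_uri='s3://my-bucket/models',  # placeholder destination
)
```
Alternatively, setting default_output_uri in clearml.conf (under the sdk.development section) applies it to every Task without touching the code.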
EnviousPanda91 'connect' will log the object properties, the automagic logging is controlled in the Task.init call. Specifically, which framework produces metrics that are not logged? Your sample code manually reports some scalars/values, do you see these as well?
Hi RipeGoose2
There is no need for any TrainsLogger in PyTorch Lightning, as they switched to using the TensorBoard logger by default, and everything passed there we automagically catch.
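To illustrate, a minimal sketch (the tiny model and random dataset are made up for the example): calling Task.init before building the Trainer is enough; Lightning's default TensorBoard logger writes the scalars and ClearML picks them up automatically.
```
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl
from clearml import Task

class TinyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.layer(x), y)
        # goes through Lightning's default TensorBoard logger,
        # which clearml catches automagically
        self.log('train_loss', loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

task = Task.init(project_name='examples', task_name='lightning auto-logging')
data = DataLoader(TensorDataset(torch.randn(64, 4), torch.randn(64, 1)), batch_size=8)
pl.Trainer(max_epochs=1).fit(TinyModel(), data)
```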
What do you think is missing, or can be improved?
In order to work with multiple credentials you have to go through the ClearML SDK.
Yes 🙂
using this is it possible to add to requirements of task with task_overrides?
Correct, but you will be replacing (not adding) requirements
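For reference, a sketch of what that could look like in a pipeline step. Note this is an assumption on my part: the 'script.requirements.pip' dot-path is where I believe the stored requirements live on the Task, and whatever is passed there replaces the requirements entirely rather than appending:
```
from clearml import PipelineController

pipe = PipelineController(name='my-pipeline', project='examples', version='1.0.0')
pipe.add_step(
    name='train',
    base_task_project='examples',
    base_task_name='train task',
    # assumption: 'script.requirements.pip' is the stored requirements field;
    # the value REPLACES the task's requirements, it does not add to them
    task_overrides={'script.requirements.pip': 'clearml\nnumpy==1.23.5\n'},
)
pipe.start()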
Hi VexedElephant56
Yes it is:
Define CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
(if running in docker mode, add -e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1 as container args)
https://clear.ml/docs/latest/docs/clearml_agent/clearml_agent_env_var
Task.completed(ignore_errors=False)
What are you getting?
Tested with two sub folders, seems to work.
Could you please test with the latest RC:
pip install clearml==0.17.5rc4
from your jupyterlab can you do:
!curl
An upload of 11GB took around 20 hours which cannot be right.
That is very, very slow, this is ~152KB/s ...
Hi LazyTurkey38
Documentation for applications is currently being worked on. Generally speaking, this is a way to package features available in ClearML with a UI interface. At first these are going to be applications built by the ClearML team; later this will be expanded so the community can contribute to them. Finally, users will be able to add their own applications (i.e. package Tasks with a UI wizard and dashboard) in their hosted solutions. wdyt?
JitteryCoyote63 good news
it is not a trains-server error, but a trains validation error, so this is easily fixed and deployed