Last but not least - can I cancel the offline zip creation if I'm not interested in it
you can override with OS environment, would that work?
Or well, because it's not geared for tests, I'm just encountering weird shit. Just calling
task.close()
takes a long time
It actually zips the entire offline folder so you can later upload it. Maybe we can disable that part?!
```python
# generate the script section
script = (
    "fr...
```
I can't seem to find a difference between the two, why would matplotlib get listed while pandas does not... Any other package that is missing?
BTW: as an immediate "hack", before your Task.init call add the following: Task.add_requirements("pandas")
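As a sketch, the "hack" above would look like this in a script (project/task names are placeholders; the key point is that Task.add_requirements must come before Task.init):

```python
from clearml import Task

# Register pandas in the task requirements *before* Task.init is called,
# so it ends up in the recorded "installed packages" of the experiment
Task.add_requirements("pandas")
# A specific version can also be pinned, e.g. Task.add_requirements("pandas", "1.5.3")

task = Task.init(project_name="examples", task_name="pandas-requirements-hack")
```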
Yes that would work 🙂
You can also put it in the docker compose see TRAINS_AGENT_DEFAULT_BASE_DOCKER
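In the docker-compose file that could look something like this (fragment only; the service name and base image are assumptions, adjust to your own compose file):

```yaml
services:
  trains-agent-services:
    environment:
      # Default base docker image the agent uses when none is set on the Task
      TRAINS_AGENT_DEFAULT_BASE_DOCKER: "nvidia/cuda:11.0.3-runtime-ubuntu20.04"
```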
Thanks OutrageousGrasshopper93
I will test it "!".
By the way the "!" is in the project or the Task name?
The idea of queues is, on the one hand, not to let users have too much freedom, and on the other, to allow for maximum flexibility & control.
The granularity offered by K8s (and as you specified) is sometimes way too detailed for a user. For example, I know I want 4 GPUs, but 100GB of disk space? No idea, just give me 3 levels to choose from (if any; actually I would prefer a default that is large enough, since this is by definition for temp cache only). The same argument goes for the number of CPUs.
Ch...
We just don't want to pollute the server when debugging.
Why not ?
you can always remove it later (with Task.delete) ?
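A minimal sketch of removing it later (the task ID is a placeholder):

```python
from clearml import Task

# Fetch the experiment by its ID (placeholder) and delete it from the server
task = Task.get_task(task_id="<your-task-id>")
task.delete()
```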
as a backup plan: is there a way to have an API key set up prior to running docker compose up?
Not sure I follow, the clearml API pair is persistent across upgrades, and the storage access tokens are unrelated (i.e. also persistent), what am I missing?
Are you using tensorboard or do you want to log directly to trains ?
Hi @<1536881167746207744:profile|EnormousGoose35>
Could we just share the entire project instead of the Workspace?
You mean allow access to a project between workspaces ?
If the answer is yes, then unfortunately the SaaS version (app.clear.ml) does not really support this level of RBAC; this is part of the enterprise version, which assumes a large organization with the need for that kind of access limit.
What is the use case ? Why not just share the entire workspace ?
I still wonder how no one noticed ... (maybe 100 unique title/series reports is a relatively high threshold)
Yes that is an issue for me, even if we could centralize an environment today, it leaves a concern whenever we add a model that possible package changes are going to cause issues with older models.
yeah changing the environment on the fly is tricky, it basically means spinning an internal http service per model...
Notice you can have many clearml-serving-sessions, they are not limited, so this means you can always spin new serving with new environments. The limitation is changing an e...
FiercePenguin76
So running the Task.init from the jupyter-lab works, but running the Task.init from the VSCode notebook does not work?
I would like to put a table with url links and image thumbnails.
StraightParrot3 links will work inside the table (your code sample looks like the correct way to add them), but I think plotly (which is the UI package that displays the table) does not support embedding images into tables 🙂
When they add it, the support will be transparent and it would work as you expect
But this config should almost never need to change!
Exactly the idea 🙂
notice the password (initially random) is also fixed on your local machine, for the exact same reason
NVIDIA_VISIBLE_DEVICES=0,1
Basically it is used "as is", and the Nvidia drivers do the rest
Same goes for all or 0-3 etc.
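To illustrate the accepted forms ("0,1", "all", or a range like "0-3"), here is a small hypothetical helper that expands such a spec into explicit device indices; this is only an illustration, not clearml-agent's actual parser:

```python
def expand_gpu_spec(spec: str) -> str:
    """Expand a GPU spec such as "0,1", "0-3" or "all" into explicit indices."""
    if spec == "all":
        return "all"  # pass through: expose every GPU
    indices = []
    for part in spec.split(","):
        if "-" in part:  # range form, e.g. "0-3"
            start, end = part.split("-")
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return ",".join(str(i) for i in indices)

print(expand_gpu_spec("0-3"))  # → 0,1,2,3
```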
NastySeahorse61 it might be that the frequency at which it tests the metric storage is only once a day (or maybe half a day), let me see if I can ask around
(just making sure you can still login to the platform?)
JitteryCoyote63 Should be quite safe, there is no major change that I'm aware of on the ClearML side that can affect it.
That said, wait for after the weekend, we are releasing a new ClearML package, I remember there was something with the model logging, it might not directly have something to do with ignite, but worth testing on the latest version.
so what should the value of "upload_uri" be set to, e.g. fileserver_url ?
yes, that would work.
Apparently the error comes when I try to access the pipeline component
load_model
from
get_model_and_features
. It works if load_model is not set as a pipeline component but only as a helper function (provided it is declared before the component that calls it; I already understood that and fixed it, which is different from the code I sent above).
ShallowGoldfish8 so now I'm a bit confused, are you saying that now it works as expected ?
I'm thinking it's generically a kernel gateway issue, but I'm not sure if other platforms are using that yet
The odd thing is that you can access the notebook, but it returns zero kernels ..
Hi @<1785479228557365248:profile|BewilderedDove91>
It's all about the databases under the hood, so 8GB is really a must
Do you mean it recently become part of enterprise version?
I do not think so, but it seems the support for the open-source version is more like a PoC
https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py
Is there a way to document these non-standard entry points?
@<1541954607595393024:profile|BattyCrocodile47> you should see the "run" in the Args section under Configuration
in case of HF you should see "-m huggingface" and then the rest in the Args section
(if this does not work, then I assume this is a bug 🙂 )
The idea is of course that you can always enqueue and reproduce, so if that part is broken we should fix it 🙂
Hi Team, can I clone an experiment shared by someone, via a link?
You mean someone that is not in your workspace ? (I'm assuming app.clear.ml ?)
orchestration module
When you previously mentioned cloning the Task in the UI and then running it, how do you actually run it?
regarding the exception stack
It's pointing to a stdout that was closed?! How could that be? Any chance you can provide a toy example for us to debug?
trains-agent build --docker nvidia/cuda --id myTaskId --target base_env_services
It's building a gpu enabled docker...
you might want a diff container or to specify --cpu-only
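For example, the CPU-only variant of the same build could look like this (the base image here is an assumption; --cpu-only is the flag mentioned above):

```
trains-agent build --docker ubuntu:18.04 --id myTaskId --target base_env_services --cpu-only
```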
AntsyElk37
and when I try to use --output-uri I can't pass true because obviously I can't pass a boolean, only strings
hmm, that sounds right, I think we should fix that so when using --output-uri true the value that is passed is actually True, not the string "true".
Regarding the issue itself:
are you saying --skip-task-init is being ignored, and it always adds the Task.init call? You can also pass --output-uri https://files.clear.ml (which is the same as True), ...
We are planning an RC later this week, I'll make sure this fix is part of it
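The fix being discussed could be sketched as a small coercion step on the parsed argument (a sketch only, not the actual clearml-agent implementation):

```python
def coerce_output_uri(value: str):
    """Map the literal strings "true"/"false" to booleans; keep anything else as a URI."""
    low = value.strip().lower()
    if low == "true":
        return True   # equivalent to enabling the default upload destination
    if low == "false":
        return False
    return value      # an explicit URI, e.g. "https://files.clear.ml"

print(coerce_output_uri("true"))  # → True
```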