LethalCentipede31 I think seaborn uses matplotlib, so it should just work:
https://github.com/allegroai/clearml/blob/6a91374c2dd177b7bdf4c43efca8e6fb0d432648/examples/frameworks/matplotlib/matplotlib_example.py#L48
What is the proper way to change a clearml.conf ?
inside a container you can mount an external clearml.conf, or override everything with OS environment variables
https://clear.ml/docs/latest/docs/configs/env_vars#server-connection
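For example (a sketch only: the image name, hostnames, and keys are placeholders, and the config path inside the container assumes the default root home):

```shell
# Option 1: mount an external clearml.conf into the container
docker run -v /path/on/host/clearml.conf:/root/clearml.conf my-image

# Option 2: override the server connection with environment variables
docker run \
  -e CLEARML_API_HOST=https://api.your-server.com \
  -e CLEARML_WEB_HOST=https://app.your-server.com \
  -e CLEARML_FILES_HOST=https://files.your-server.com \
  -e CLEARML_API_ACCESS_KEY=your-access-key \
  -e CLEARML_API_SECRET_KEY=your-secret-key \
  my-image
```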
Maybe this one?
https://github.com/allegroai/clearml/issues/448
I think it is already there (i.e. 1.1.1)
Hi @<1684010629741940736:profile|NonsensicalSparrow35>
But the provided command is missing the URL target for the curl call, so it is not complete.
Not sure I followed. Did you specify "NEW_ADDRESS"?
Or is the URL localhost in both cases?
Yes, it is reproducible. Do you want a snippet?
Already fixed 🙂 please ping tomorrow, I think an RC should be out soon with the fix
(only works for pytorch, because they have different wheels for different CUDA versions)
(the payload is not in the correct form, can that be a problem?)
It might, but I assume you will get a different error
- try with the latest RC
1.8.1rc2
it feels like after git clone, it spends minutes without outputting anything
yeah that is odd. Can you run the agent with --debug (add it before the daemon
command), and then add --foreground at the end of the command?
Now launch the same task on that queue, you will have a verbose log in the console.
Let us know what you see
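Something like this (the queue name is a placeholder):

```shell
# --debug goes before the daemon sub-command, --foreground at the very end
clearml-agent --debug daemon --queue my_queue --foreground
```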
I think the limit is a few GB, I'm not sure, I'll have to check
And yes the oldest experiments will be deleted first (with the exception of published experiments, they will be deleted last)
It should be the last line (or almost) of the log. Is it there? Also, it seems from the log that you are using trains 0.14.3; try with trains 0.15 and let me know if you are still missing packages
My question is, which docker-compose version do you need?
Ohh sorry, there is no real restriction, we just wanted easy copy-paste for the installation process.
I suspect it failed to create one on the host and then mount it into the docker
command line 🙂
cmd.exe / bash
Hi JitteryCoyote63
Just making sure: the package itself is installed as part of the "Installed packages", and it also installs a command-line utility?
This is assuming you can just run two copies of your code, and they will become aware of one another.
clearml doesn’t do any “magic” in regard to this for tensorflow, pytorch etc right?
No 😞 and if you have an idea on how, that will be great.
Basically the problem is that there is no "standard" way to know which layer is in/out
Sadly, I think we need to add another option like task_init_kwargs
to the component decorator.
what do you think would make sense ?
MelancholyElk85 notice there is the pipeline controller queue (i.e. which agent will run the logic of the pipeline), and the default queue for the pipeline steps (i.e. the actual steps of the pipeline).
The default queue for the pipeline logic itself is services. You can change it: pipeline.start(..., queue='another_q')
Make sense ?
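As a sketch of the two queue settings (project, step, and queue names here are placeholders, and actually running this assumes a reachable ClearML server):

```python
from clearml import PipelineController

def my_step():
    # trivial placeholder step
    return 42

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0.0")

# The steps themselves run on the per-step execution queue (here "default")
pipe.add_function_step(name="step_one", function=my_step, execution_queue="default")

# The pipeline logic runs on the controller queue; omitting `queue`
# defaults it to "services"
pipe.start(queue="another_q")
```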
If I call explicitly
task.get_logger().report_scalar("test", str(parse_args.local_rank), 1., 0)
, this will log as expected one value per process, so reporting works
JitteryCoyote63 and do prints get logged as well (from all processes) ?
FierceHamster54 are you sure you have write permissions ?
okay, so it is downloaded to your machine and unzipped. Is that part correct?
If I log 20 scalars every 2000 training steps and train for 1 million steps (which is not that big an experiment), that's already 10k API calls...
They are batched together, so at least in theory you should not get to 10K calls that fast. But a very good point
Oh nice! Is that for all logged values? How will that count against the API call budget?
Basically this is the "auto flush": it will flush (and batch) all the logs on a 30-second period, and yes, this is for all the logs (...
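Conceptually the batching works something like this simplified sketch (not ClearML's actual reporting code): reports are buffered and only sent as one batched call when the flush period elapses or the buffer fills up, so many report calls collapse into far fewer API calls.

```python
import time

class BatchedReporter:
    """Toy sketch of batched reporting: buffer events, one 'API call' per flush."""

    def __init__(self, flush_period_sec=30.0, max_batch=100):
        self.flush_period_sec = flush_period_sec
        self.max_batch = max_batch
        self._buffer = []
        self._last_flush = time.monotonic()
        self.api_calls = 0  # how many batched calls actually went out

    def report_scalar(self, title, series, value, iteration):
        self._buffer.append((title, series, value, iteration))
        now = time.monotonic()
        if len(self._buffer) >= self.max_batch or now - self._last_flush >= self.flush_period_sec:
            self.flush()

    def flush(self):
        if self._buffer:
            self.api_calls += 1  # one call for the whole batch
            self._buffer.clear()
        self._last_flush = time.monotonic()

reporter = BatchedReporter(flush_period_sec=30.0, max_batch=100)
for step in range(1000):
    reporter.report_scalar("loss", "train", 0.1, step)
reporter.flush()
print(reporter.api_calls)  # 1000 reports -> 10 batched calls
```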
Funny enough I’m running into a new issue now.
Sorry, my bad, I thought you would have known 😉 yes, it probably should be packages=["clearml==1.1.6"]
BTW: do you have any imports inside the pipeline function itself? If you do not, then there is no need to pass "packages" at all, it will just add clearml
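For example, a minimal sketch (the function name, file path, and version pins are illustrative):

```python
from clearml import PipelineDecorator

# `packages` is only needed here because the function body imports a
# third-party module; otherwise clearml alone is added automatically
@PipelineDecorator.component(packages=["clearml==1.1.6", "pandas"])
def preprocess(data_path):
    import pandas as pd  # imported inside the step, hence listed in `packages`
    return pd.read_csv(data_path)
```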
The problem is that even when I mount the SSH key into the root home directory (e.g.,
/root/.ssh/id_rsa
with the correct permissions set to 400) I still encounter the same error.
The agent automatically mounts the .ssh folder from the host into the container, making sure all the permissions are set.
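If you want to reproduce that manually, a rough equivalent looks like this (the image name is a placeholder, and the target path assumes the container runs as root):

```shell
# mount the host's .ssh folder read-only into the container's root home
docker run -v "$HOME/.ssh":/root/.ssh:ro my-image
```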
how can I run
pip install -e .
in general the agent will add the "working" dir to the PYTHONPATH, so you should not have to manually run pip install -e .
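The effect is roughly like this self-contained sketch, which puts a working dir on the import path instead of installing the package (the module name is made up for illustration):

```python
import os
import sys
import tempfile
import textwrap

# Create a tiny "repo" with a local module, mimicking the agent's working dir
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "mypkg.py"), "w") as f:
    f.write(textwrap.dedent("""
        def answer():
            return 42
    """))

# What the agent effectively does: prepend the working dir to the import
# path, so local modules resolve without running `pip install -e .`
sys.path.insert(0, workdir)

import mypkg
print(mypkg.answer())  # 42
```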
Tha...
Hi @<1664079296102141952:profile|DangerousStarfish38>
You mean spin up the agent on multiple Windows machines? Yes, that is supported. I think it is limited to venv (i.e. not docker) mode, but other than that it should work out of the box
That wasn't scheduled by ClearML).
This means that from the ClearML perspective they are "manual", i.e. the job itself (by calling Task.init) creates the experiment in the system and fills in all the fields.
But for a k8s job, I'm still unsuccessful.
HelpfulDeer76 when you say "unsuccessful", what exactly do you mean?
Could it be they are reported to the clearml demo server (the default server if no configuration is found) ?