For example:
examples/k8s_glue_example.py --queue k8s_gpu - --namespace pod-clearml-conf ~/trains.conf --template-yaml example/base.yml
Another possible issue: I'm on Ubuntu, and some of the packages might have been built for Windows, so those versions might not exist.
Usually this is not the case; the version numbers match (implementation-wise it might be a different file, but it is almost always a matching version).
Do you have a roadmap which includes resolving things like this?
Security, SSO, etc. are usually out of scope for the open-source platform, as they make the entire thing a lot harder to install and manage. That said, I know that the Enterprise solution does have SSO and LDAP support, and probably many more security features. I hope it helps 🙂
Hi StrangeStork48
secrets manager per se,
Quick question, are you running the trains-server over http or https ?
Hmm, I think you have a point here; the confusing part is the cp command. Can you send the full log? (Regardless, can I assume you are running a rootless container?)
we need to evaluate the result across many random seeds, so each task needs to log the result independently.
Ohh that kind of makes sense to me 🙂
Yes I'm also getting:
/usr/local/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 74 leaked semaphores to clean up at shutdown
len(cache))
Not sure about that ...
Is the clearml server a worker I can serve models on?
The serving is done by one of the clearml-agents.
Basically you spin an agent, then this agent is spinning the model serving engine container (fully managed).
(1) install and run clearml-agent (2) run the clearml-serving CLI to configure and spin the serving engine
What's the exact error you are getting ?
(Maybe this is a permissions error on the cache folder. Which folders is it using? You can see them in the configuration as well.)
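If it helps to rule out a permissions problem, here is a minimal sketch that checks whether the current user can read and write a folder. The helper name check_folder_access is made up for illustration, and ~/.clearml/cache is only an assumed location; swap in whatever folders your configuration actually lists.

```python
import os
from pathlib import Path


def check_folder_access(folder):
    """Return basic access info for a folder, as seen by the current user."""
    path = Path(folder).expanduser()
    return {
        "path": str(path),
        "exists": path.exists(),
        # os.access returns False for a missing path, so this is safe to call
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Assumed cache location, replace with the folder from your configuration
    print(check_folder_access("~/.clearml/cache"))
```

Run it as the same user (and inside the same container) that runs the failing process, otherwise the result says nothing about the actual error.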
GreasyPenguin14 I think the default is reporting on failed tasks only? could that be?
ReassuredTiger98 what do you have in the clearml.conf under "conda_channels" ?
Is this it ?
PompousParrot44 these are the default plotly colors. You can change any of the layout properties with the
https://github.com/allegroai/trains/blob/65a4aa7aa90fc867993cf0d5e36c214e6c044270/trains/logger.py#L600
Hmm what do you have here?
os.system("cat /var/log/studio/kernel_gateway.log")
Is the agent idle? Or is it running something else?
If you want each "main" process as a single experiment, just don't call Task.init in the scheduler
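A rough sketch of that layout, assuming the scheduler launches each "main" as its own subprocess. The script name train.py, the --run-id flag and the helper names are placeholders, not something from your setup. The scheduler itself never calls Task.init; only the child script does.

```python
import subprocess
import sys


def build_commands(script, run_ids):
    """Build one command line per run; each child process is one experiment."""
    return [[sys.executable, script, "--run-id", str(run_id)] for run_id in run_ids]


def run_all(script, run_ids):
    """Launch the runs one after another. No Task.init here, by design."""
    for cmd in build_commands(script, run_ids):
        subprocess.run(cmd, check=True)


# Inside the (hypothetical) train.py, each process creates its own task:
#   from clearml import Task
#   task = Task.init(project_name="examples", task_name="run " + run_id)

if __name__ == "__main__":
    run_all("train.py", [0, 1, 2])
```

Since every child is a separate process with its own Task.init call, each one shows up as its own experiment.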
(torchvision vs. cuda compatibility, will work on that),
The agent will pull the correct torch based on the cuda version that is available at runtime (or configured via the clearml.conf)
We're not using a load balancer at the moment.
The easiest way is to add an ELB and have Amazon add the HTTPS on top (basically a few clicks on their console)
Closing the data doesn't work: dataset.close() AttributeError: 'Dataset' object has no attribute 'close'
Hi NastyOtter17, could you send the full exception ?
logger.report_scalar("loss", "train", iteration=0, value=100)
logger.report_scalar("loss", "test", iteration=0, value=200)
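To make the two calls above a bit more concrete, here is a small sketch that wraps them in a helper. The names report_losses and DryRunLogger are invented for illustration; the stand-in logger only mimics the report_scalar keyword arguments so the snippet runs without a server. With a real task you would pass the logger you already have (for example the one from Task.current_task().get_logger()).

```python
def report_losses(logger, iteration, train_loss, test_loss):
    """Report both losses under the same title ("loss") as two series."""
    logger.report_scalar("loss", "train", iteration=iteration, value=train_loss)
    logger.report_scalar("loss", "test", iteration=iteration, value=test_loss)


class DryRunLogger:
    """Stand-in that records calls instead of sending them anywhere."""

    def __init__(self):
        self.records = []

    def report_scalar(self, title, series, value, iteration):
        self.records.append((title, series, iteration, value))


dry_logger = DryRunLogger()
report_losses(dry_logger, iteration=0, train_loss=100, test_loss=200)
print(dry_logger.records)
```

The same title with different series names is what puts "train" and "test" on one graph.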
I think I found something, let me dig deeper 🙂
BTW: how did it get there ?
I want to store only my raw data in my blob storage, and I want to create a Hyperdataset with all the artifacts, metrics, frames,
Yes that's exactly how it works.
This line adds a reference to the raw file (local/remote)
[https://github.com/allegroai/clearml/blob/1b474dc0b057b69c76bc2daa9eb8be927cb25efa[…]es/hyperdatasets/data-registration/register_dataset_wit...
Thanks PompousBaldeagle18 !
Which software did you use to create the graphics?
Our designer, should I send your compliments 😉 ?
You should add which tech is being replaced by each product.
Good point! we are also missing a few products from the website, they will be there soon, hence the "soft launch"