Reputation
Badges 1
25 × Eureka!Could it be the code is not in a git repository ?clearml support either a single script or a git repository, but Not a collection of standalone files. wdyt?
Hi GreasyPenguin66
Is this for the client side ? If it is why not set them in the clearml.conf ?
PanickyMoth78 'tensorboard_logger' is an old deprecated package that meant to create TB events without TB, it was created before TB was a separate package. Long story short, it is not supported. That said if you just run the same code and replace tensorboard_logger with tensorboard, you should see all scalars in the UI
background:
ClearML logs TB events as they are created in real-time, TB_logger is not TB, it creates events and dumps them directly into a TB equivalent event file
you mean in the enterprise
Enterprise with the smarter GPU scheduler, this is inherent problem of sharing resources, there is no perfect solution, you either have fairness, but then you get idle GPU's of you have races, where you can get starvation
Hi
The Squash operation copies all the data and is no longer linked to previous commits?
Yes, basically the idea is if you have data version that relies on many parents that needs to be merged, the squash will create a merged copy and push it all as a single version, and then yes the parent versions are no longer needed
I thought this operation is like git squash but it seems to me
yeah... we did not want to actually delete the parents because unlike git, the operation is done ...
Hi TightElk12
Are you looking for a way to set the output_uri from environment variable ? Is this it?
How can i get loaded model in Preporcess class in ClearML Serving?
ComfortableShark77
You mean your preprocess class needs a python package or is it your own module ?
LittleShrimp86 can you post the full log of the pipeline? (something is odd here)
- Suppose that the serving project A is serving some model version 1 and a new model is trained and it starts serving model version 2, but on runtime due to some reason reason we need to revert to model version 1, what would be the best way to achieve the above?
If you archive the model, then the cleaml-session will pick the "latest" non-archived model, essentially reverting to the previous version. Also notice that it supports multiple versions on a single endpoint (again also a feat...
Hi @<1564422644407734272:profile|DistressedCoyote60>
I have an ML project that is not on git. It is separated into several files:
Yes you are correct , if your code is composed of multiple files, you have to have a git repo for the agent to be able to run it.
😞
That said, it is free on an any of these services github/bitbucket/gitlab 🙂
Hi SarcasticSparrow10 , so yes it does, this is more efficient when using pytorch loaders, and in some other situations.
To disable it add to your clearml.conf:sdk.development.report_use_subprocess = false2. interesting error, maybe we can revert to "thread mode" if running under a daemon. (I have to admit, I'm not sure why python has this limitation, let me check it...)
task = Task.init(...) if task.running_locally(): # wait for the repo detection and requirements update task._wait_for_repo_detection() # reset requirements task._update_requirements(None)🙂
Can you please tell me if it is possible to set up slack monitoring in clearml?
It is 🙂
This one?
https://clear.ml/docs/latest/docs/guides/services/slack_alerts
Long story short, this is done internally when you call the Task.init (I think, there is a chance it is called before)
One way of controlling it would be to have something like:Task.init(auto_connect_frameworks={'hydra': {'log_before_resolve': True}})That said, I think it will be simpler to store both (in different section of course)
Maybe "Configuration Object: OmegaConf" and "Configuration Object: OmegaConfDefinition" ?
time.sleep(time_sleep)
You should not call time.sleep in async functions, it should be asyncio.sleep,
None
See if that makes a difference
PompousBeetle71 notice that starting with this version when you set model tags they will be stored as user tags , which you can change and edit in UI. So if you still need the system tags you have to access them directly.
My task starts up and checks the mounted EFS volume for x data, if x data does not exist there, it then pulls x data from S3.
BoredHedgehog47 you can just use StorageManager and configure clearml cache for the EFS, it will essentially do the same 🙂
Regrading helm chart with EFS,
you need to configure the clearml-glue pod template with the EFS mount
example :
https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/e7f647f4e6fc76f983d61522e635353005f1472f/examples/kubernetes/volu...
ExcitedSeaurchin87 can I assume in parallel means threads ?
Also, is this a single Dataset version download? at least in theory option (3) is the new default in the latest clearml version. wdyt?
PompousBeetle71 oh no 😞
okay this is a bit drastic, but let's see if it helps.
In your trains.conf, add the following section:loggers { loggers { trains { level: ERROR } } }
PompousBeetle71 kudos on the solution!
What were the loggers you ended up setting?
I'd like to make sure we fix this issue
Hi PompousBeetle71
Try this one, let me know if it helpedlogging.getLogger('trains.frameworks').setLevel(ERROR)
Hi DepressedFish57
In my case download each part takes ~5 second, and unzip ~15.
We run into that, and the new version will employ multithreading approach for the unzip (meaning the unzipping will happen in the background)
Oh no 😞 I wonder if this is connected to:
Any chance the logger is running (or you have) from a subprocess ?
Nice! TrickySheep9 any chance you can share them ?