So the Squash operation copies all the data, and the result is no longer linked to the previous commits?
Yes. Basically, the idea is that if you have a data version that relies on many parents and needs to be merged, squash will create a merged copy and push it all as a single version; after that, yes, the parent versions are no longer needed.
I thought this operation was like git squash, but it seems it isn't.
yeah... we did not want to actually delete the parents because unlike git, the operation is done ...
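If it helps, a minimal sketch of what I mean, assuming the Dataset.squash call (the dataset name and parent version IDs below are placeholders):

from clearml import Dataset

# squash two parent versions into a single stand-alone version
merged = Dataset.squash(
    dataset_name="my_dataset_squashed",                         # placeholder name
    dataset_ids=["<parent_version_id_1>", "<parent_version_id_2>"],  # placeholder IDs
)
print(merged.id)  # the new merged version, no longer dependent on the parents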
Thanks SparklingHedgehong28
So I think I'm missing information on what you call "Instance protection"?
You mean like respinning spot instances? Or is it a way to review the performance of an AWS ASG (i.e. like a watchdog of sorts)?
Hi MammothGoat53
Basically what you are missing are the headers with the Token you have:
https://blog.logrocket.com/secure-rest-api-jwt-authentication/
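Just as an illustration, something along these lines (the URL and token are placeholders, and I'm using plain requests here):

import requests

url = "https://api.example.com/protected/resource"   # placeholder endpoint
token = "<your_jwt_token>"                            # placeholder token

# the missing piece: pass the token in the Authorization header
response = requests.get(url, headers={"Authorization": f"Bearer {token}"})
response.raise_for_status()
print(response.json())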
Notice the order here:
Task.add_requirements("tensorflow")
task = Task.init(...)
Hi GrittyCormorant73
In the end, everything goes through session.send, so you can add a print there?
btw: why would you print all the requests? what are we debugging here?
Meanwhile you can just sleep for 24 hours and put it all on the services queue, it should work 🙂
Example here:
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
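Roughly something like this, assuming you enqueue it onto the services queue (the project/queue names are placeholders):

from time import sleep
from clearml import Task

task = Task.init(
    project_name="DevOps",              # placeholder project
    task_name="daily cleanup",          # placeholder name
    task_type=Task.TaskTypes.service,
)
# stop local execution and re-launch this task on the services queue
task.execute_remotely(queue_name="services", exit_process=True)

while True:
    # ... do the actual cleanup work here ...
    sleep(60 * 60 * 24)  # wait 24 hours between iterations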
Hi SillySealion58
"keep N best checkpoints" logic in my training loop.
If this is the use case, may I suggest overwriting them locally? (The same will happen on the remote storage.) This is exactly how the Lightning / Ignite feature is implemented.
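For example, a sketch of the overwrite idea only (not the actual Lightning/Ignite implementation; filenames and scoring are placeholders, and I'm assuming torch.save for the checkpoints):

import heapq
import torch

N_BEST = 3
_best = []  # min-heap of (score, filename) for the current best checkpoints

def save_if_best(model, score, epoch):
    if len(_best) < N_BEST:
        filename = f"best_{len(_best)}.pt"
    elif score > _best[0][0]:
        # reuse the filename of the checkpoint being evicted,
        # so the new save overwrites it locally (and on the remote storage)
        _, filename = heapq.heappop(_best)
    else:
        return
    heapq.heappush(_best, (score, filename))
    torch.save({"epoch": epoch, "state_dict": model.state_dict()}, filename)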
E.g., I'm creating a task using clearml.Task.create, and often it doesn't get the git diff correctly.
ShakyJellyfish91 Task.create does not store any "git diff" automatically. Is there a reason not to use Task.init?
EnviousStarfish54 a fix is already available in the latest RC
Could you verify it solves your issue as well?
pip install trains==0.16.2rc0
CleanWhale17 nice ... 🙂
So the answer is Trains supports the Pipeline / Automation of it, but lacks that dataset integration (that is basically up to you to manage, with either artifacts or any other method)
The Allegro Enterprise allows you to rerun the code, on a new version of the dataset from the UI (or automation) without changing a single line of code 🙂
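For example, a rough sketch of the artifact-based hand-off (project/task/file names are placeholders):

from clearml import Task

# producer: register the dataset file as an artifact
producer = Task.init(project_name="examples", task_name="prepare data")
producer.upload_artifact(name="dataset", artifact_object="data.csv")  # placeholder file
producer_id = producer.id
producer.close()

# consumer (e.g. the training step): fetch the artifact back by task id
consumer = Task.init(project_name="examples", task_name="train")
dataset_path = Task.get_task(task_id=producer_id).artifacts["dataset"].get_local_copy()
print(dataset_path)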
Hi @<1542316991337992192:profile|AverageMoth57>
Not sure I follow; what do you have in mind regarding the Gerrit integration?
Sounds interesting ...
wdyt?
TBH ClearML doesn't seem to be picking the model up so I need to do it manually
This is odd; ClearML will pick up framework-level serialization, but not just any pickle call.
Why do I need an output_uri for the model saving? The dataset API can figure this out on its own
So that it knows where to upload it. If you are setting output_uri=True,
this will be the default files server; you can also set it to a shared file system, S3, GCP storage, etc.
If no value is passed, it will just log th...
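A minimal sketch (the bucket / path below are placeholders):

from clearml import Task

# output_uri=True -> upload model snapshots to the default files server
task = Task.init(project_name="examples", task_name="train", output_uri=True)

# or point it at your own storage instead:
# task = Task.init(..., output_uri="s3://my-bucket/models")
# task = Task.init(..., output_uri="/mnt/shared/models")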
Yeah, the hack would work, but I'm trying to use it from the command line to put it in Airflow. I'll post on GH.
Oh, then set the TMP/TMPDIR environment variable; it should have the same effect.
Alright, I have a follow-up question then: I used the param --user-folder "~/projects/my-project", but any change I make is not reflected in this folder. I guess I am in the docker space, but this folder is not linked to the folder on the machine. Is it possible to do so?
Yes, you must make sure the docker container can mount a persistent folder for you to work on.
Let me check what's the easiest way to do that
However, regarding your recommendation of using the StorageManager class to delete the URL, it seems that this class only contains methods for checking the existence of files, downloading files and uploading files, but no method for actually deleting files based on their URL (see the docs).
Yes, you are correct 🙂 you should use a "deeper" class:
helper = StorageHelper.get(remote_url)
helper.delete(remo...
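For completeness, roughly like this (the URL is a placeholder, and I'm assuming the StorageHelper import path):

from clearml.storage.helper import StorageHelper

remote_url = "s3://my-bucket/artifacts/old_model.pkl"  # placeholder URL of the file to remove

helper = StorageHelper.get(remote_url)
helper.delete(remote_url)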
SpicyCrab51 you can change the task to completed; it is just a state change, nothing will actually change other than the status.
Task.get_task(pass_dataset_id_here).mark_complete()
Right, so this "vault" design is built into the paid tiers of ClearML to achieve exactly that. Long story short, users can put their credentials/configs on the clearml-server and the agent (or the clients) will pull and merge them into the execution.
It's very cool and works really nice, but not part of the open source (or the SaaS tier).
What you could do is store these configurations on the Task itself (one way or another). Maybe, for example, have an empty definitions.py
file part of ...
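For example, a sketch of storing/fetching such a file through the task (file and section names are placeholders, using connect_configuration):

from clearml import Task

task = Task.init(project_name="examples", task_name="train")

# attach the local file to the task; when executed remotely this returns
# a local copy of whatever is stored (and possibly edited in the UI) on the task
config_path = task.connect_configuration(configuration="definitions.py", name="definitions")
print(config_path)  # load / import this file instead of the original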
RoundMosquito25 actually you can 🙂
# check the state every minute
while an_optimizer.wait(timeout=1.0):
    running_tasks = an_optimizer.get_active_experiments()
    for task in running_tasks:
        task.get_last_scalar_metrics()
        # do something here
Baseline reference:
https://github.com/allegroai/clearml/blob/f5700728837188d7d6005726c581c9d74fd91164/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py#L127
Hi ExcitedCat13
Sure, download the plugin from the git repo (Install instructions in the repo).
Regarding remote debugging, are you referring to SSH?
The plugin itself is designed to make sure that when you work on a remote machine with PyCharm, ClearML will log the local git repo and changes (as the .git folder is not synced to the remote machine).
They don't give an in-app notification.
Oh I see, I assume this is because the GitHub account is not connected with any email, so no invite is sent.
Basically they should just be able to re-login and then they could switch to your workspace (with the link you generated)
Hi @<1560798754280312832:profile|AntsyPenguin90>
The image itself is uploaded in a background process; flush just triggers the starting of the process.
Could it be that it shows up a few seconds later?
Pseudo-ish code:
1. Create the pipeline:
pipeline = Task.create(..., task_type="controller")
pipeline.mark_started()
print(pipeline.id)
2. launch step A (pass arguments via command line argument / os environment)
task = Task.init(...)
pipeline_id = os.environ['MY_MAIN_PIPELINE']
pipeline_task = Task.get_task(task_id=pipeline_id)
# send some metrics / reports etc.
pipeline_task.get_logger().report_scalar(...)
pipeline_task.get_logger().report_text(...)
wdyt? (obviously you need to somehow pass th...
Hi FantasticPig28
or does every individual user have to configure their own minio credentials?
You can configure the client's files_server
entry in the clearml.conf (or use an OS environment variable):
files_server: "<your MinIO endpoint>"
https://github.com/allegroai/clearml/blob/12fa7c92aaf8770d770c8ed05094e924b9099c16/docs/clearml.conf#L10
Notice to make sure you also provide credentials here:
https://github.com/allegroai/clearml/blob/12fa7c92aaf8770d770c8ed05094e924b9099c16/docs/clearml.conf#L97
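Roughly, the two pieces together in clearml.conf would look something like this (host, bucket and credentials are placeholders, and the exact keys may differ slightly between versions):

api {
    # upload destination for artifacts/models (placeholder MinIO endpoint)
    files_server: "s3://my-minio-host:9000/clearml-bucket"
}
sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "my-minio-host:9000"     # placeholder
                    key: "minio_access_key"        # placeholder
                    secret: "minio_secret_key"     # placeholder
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}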
Click on the Task it is running and abort it; it seems to be stuck, and I guess this is why the others are not pulled.
🙂 Let me know if it solved the issue 🙂
Hi AttractiveCockroach17
In your "Installed Packages" (when the task is in draft mode, you can edit it like any requirements.txt), you need to add:package @ git+
You can also make sure you have in in the first place bu addingTask.add_requirements("package", "@ git+
") task = Task.init(...)
Hi NonchalantGiraffe17
You mean this documentation?
https://clear.ml/docs/latest/docs/references/api/tasks#post-tasksclone
BTW: any specific reason for going the RestAPI way and not using the python SDK ?
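With the SDK, it would be roughly like this (the task ID and queue name are placeholders):

from clearml import Task

template = Task.get_task(task_id="<source_task_id>")            # placeholder ID
cloned = Task.clone(source_task=template, name="cloned experiment")
Task.enqueue(cloned, queue_name="default")                       # placeholder queue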
AttractiveCockroach17 can I assume you are working with the Hydra local launcher?
Okay, so I can't figure out why it would "kill" the new experiments; I mean, it should run them, but is there any "smart stopping" that causes it to kill the process before it ends?
BTW: can this be reproduced with the ClearML Hydra example?