However, I have not yet found a flexible solution other than ssh-agent forwarding.
And is it working?
It seems like the web server doesn't log the call to AWS, I just see this:
This points to the browser actually sending the AWS delete command. Let me check with FE tomorrow
Hi @<1673501397007470592:profile|RelievedDuck3>
how can I configure my alerts to be notified when the distribution of my metrics (variables) changes on my heatmaps?
This can be done inside grafana, here is a simple example:
Specifically, you need to create a new metric that is the distance of the current distribution (i.e. heatmap) from the previous window, then on the distance metric, ...
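There is no single canonical distance here; as an illustrative sketch (the function names and the choice of Jensen-Shannon divergence are my own, not a Grafana feature), the distance between two consecutive histogram windows could be computed like this and exported as the metric you alert on:

```python
import math

def _normalize(hist):
    # Turn raw bin counts into a probability distribution
    total = sum(hist)
    return [x / total for x in hist]

def js_divergence(prev_hist, curr_hist):
    """Jensen-Shannon divergence between two histograms (same bin edges).

    Returns 0 for identical distributions; grows as they drift apart.
    """
    p = _normalize(prev_hist)
    q = _normalize(curr_hist)
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence, skipping empty bins
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return (kl(p, m) + kl(q, m)) / 2

# Distance between the previous window's histogram and the current one;
# an alert rule would fire when this crosses a threshold.
previous = [10, 40, 40, 10]
current = [10, 10, 40, 40]
drift = js_divergence(previous, current)
```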
My bad, there is a mixture in terms.
"configuration object" is just a dictionary (or plain text) stored on the Task itself.
It has no file representation (well, you could get it dumped to a file, but it is actually stored as a blob of text on the Task itself, on the backend side)
Check out the trains-agent repo: https://github.com/allegroai/trains-agent
It is fairly straight forward.
Thanks CharmingShrimp37 !
Could you PR the fix ?
It will be just in time for the 0.16 release 🙂
Yes the clearml-server AMI - we want to be able to back it up and encrypt it on our account
I think the easiest and safest way for you is to actually have full control over the AMI, and recreate it once from scratch.
Basically any Ubuntu/CentOS + docker and docker-compose should do the trick, wdyt?
That is a bit odd, but SSH keys have to have specific chmod flags for them to work (security reasons)
What was the error ?
Woot woot 🙂
Thanks HelpfulHare30 , I would love to know what you find out, please feel free to share 🙂
Thanks @<1694157594333024256:profile|DisturbedParrot38> !
Nice catch.
Could you open a github issue so that at least we output a more informative error?
If nothing specific comes to mind, I can try to create some reproducible demo code (after the holiday vacation)
Yes please! 🙂
In the meantime, see if the workaround is a valid one
Hi RoughTiger69
One quirk I found was that even with this flag on, the agent decides to install whatever is in the requirements.txt
What's the clearml-agent version you are using?
I just noticed that even when I clear the list of installed packages in the UI, upon startup, clearml agent still picks up the requirements.txt (after checking out the code) and tries to install it.
It can also just skip the entire Python installation with: `CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1`
ideally, I want to hardcode, e.g. use_staging = True, enqueue it; and then via clone-edit_user_properties-enqueue in the UI start the second instance
Oh I see!
Actually the easiest would be to use a Section:
```
task = Task.init(...)
my_params = {'use_staging': True}
task.connect(my_params, name="General")
if my_params['use_staging']:
    # do something
    scheduler = TaskScheduler(...)
```
wdyt?
```
os.environ['TRAINS_PROC_MASTER_ID'] = '1:da0606f2e6fb40f692f5c885f807902a'
os.environ['OMPI_COMM_WORLD_NODE_RANK'] = '1'
task = Task.init(project_name="examples", task_name="Manual reporting")
print(type(task))
```
Should be: <class 'trains.task.Task'>
Yes, that should work. The only thing is you need to call Task.init on the master process (and make sure you call Task.current_task() on the subprocesses if you want the automagic to kick in). That said, usually there is no need; they are supposed to report everything back to the main one anyhow
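This is not the ClearML internals, just a sketch of the pattern described above: the master process owns the single "main" task, and subprocesses report their results back to it instead of creating their own (the worker payload and rank count are made up for illustration):

```python
import multiprocessing as mp

def worker(rank, queue):
    # Subprocesses do not create their own "main" task; they just
    # send their reports back to the parent process.
    queue.put((rank, f"metric from rank {rank}"))

def run_master():
    # The equivalent of Task.init(): only the master sets up reporting.
    ctx = mp.get_context("fork")  # fork keeps the sketch simple (POSIX only)
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(r, queue)) for r in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Master collects everything the subprocesses reported
    return sorted(queue.get() for _ in procs)
```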
basically
```
@call_parse
def main(
    gpus:Param("The GPUs to use for distributed training", str)='all',
    script:Param("Script to run", str, opt=False)='',
    args:Param("Args to pass to script", nargs=...
```
PricklyRaven28 basically this is the issue:
`python -m fastai.launch <script>`
There are multiple copies of the script running, but they are Not aware of one another.
Are you getting any reporting from the different GPUs? I'm assuming there is a hidden OS environment variable that signals the "master" node, so all processes can communicate with it. This is what we should automatically capture. There is a workaround for fastai.launch that is probably similar to this one:
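The exact variable depends on the launcher; of the names below, only `OMPI_COMM_WORLD_NODE_RANK` appears earlier in this thread, the rest are illustrative. A sketch of capturing the rank from the environment to tell the master process apart:

```python
import os

def get_rank(default=0):
    """Return this process's rank from common launcher env variables.

    The variable list is illustrative, not exhaustive; different
    launchers (OpenMPI, torch.distributed, ...) use different names.
    """
    for var in ("OMPI_COMM_WORLD_NODE_RANK", "RANK", "LOCAL_RANK"):
        if var in os.environ:
            return int(os.environ[var])
    return default

def is_master():
    # By convention, rank 0 is the "master" node
    return get_rank() == 0
```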
When you say use `Task.current_task()`, you mean for logging? Which I'm guessing the fastai binding should do, right?
Right, this is a fancy way to say: make sure the actual sub-process is initializing ClearML so all the automagic kicks in. Since this is not "forked" but a whole new process, calling Task.current_task() is the equivalent of calling Task.init with the same arguments (which you can also do; I'm not sure which one is more straightforward, wdyt?)
Hi FiercePenguin76
It seems it fails detecting the notebook server and thinks this is a "script running".
What is exactly your setup?
docker image ?
jupyter-lab version ?
clearml version?
Also are you getting any warning when calling Task.init ?
Hi @<1523709807092043776:profile|GrittyKangaroo27>
some of my completed datasets,
This only has an effect on the dataset when it is being uploaded; if completed, it is there for logging purposes only. What is exactly the use case? (Just to verify: once a Task/Dataset is completed you cannot edit it)
how to make sure it will traverse only current package?
Just making sure there is no bug in the process: if you call Task.init in your entire repo (serve/train), you end up with an "installed packages" section that contains all the required packages for both use cases?
I have separate packages for serving and training in a single repo. I don't want serving requirements to be installed.
Hmm, it cannot "know" which is which, because it doesn't really trace all the import logs (this w...
PricklyRaven28 did you set the iam role support in the conf?
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/docs/clearml.conf#L86
`metric=image` is the name in the dropdown of the debug images
Hi GrievingTurkey78
First, I would look at the CLI clearml-data
as a baseline for implementing such a tool:
Docs:
https://github.com/allegroai/clearml/blob/master/docs/datasets.md
Implementation :
https://github.com/allegroai/clearml/blob/master/clearml/cli/data/main.py
Regrading your questions:
(1) No, a new dataset version will only store the diff from the parent (if files are removed, it stores metadata saying the file was removed)
(2) Yes any get operation will downl...
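The diff-from-parent behavior in (1) can be illustrated with a small sketch (plain Python mirroring the described behavior, not the clearml-data implementation; file names and hashes are made up):

```python
def version_diff(parent_files, child_files):
    """Compute what a child dataset version would need to store:
    only files added/changed relative to the parent, plus metadata
    recording which parent files were removed."""
    added = {name: h for name, h in child_files.items()
             if parent_files.get(name) != h}
    removed = [name for name in parent_files if name not in child_files]
    return {"added": added, "removed": removed}

# Parent version has two files; child keeps one, drops one, adds one
parent = {"a.csv": "hash1", "b.csv": "hash2"}
child = {"a.csv": "hash1", "c.csv": "hash3"}
diff = version_diff(parent, child)
# The child version stores only c.csv, plus a note that b.csv was removed
```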
Which clearml version are you using?
LOL totally 🙂
I see, so basically fix old links that are now not accessible? If this is the case you might need to manually change the document on the mongodb running in the backend
Logger.current_logger()
Will return the logger for the "main" Task.
The "Main" task is the task of this process, a singleton for the process.
All other instances are regular Task objects: you can have multiple Task objects and log different things to them, but you can only have a single "main" Task (the one created with Task.init).
All the auto-magic stuff is logged automatically to the "main" task.
Make sense ?
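The "single main Task per process" behavior above can be sketched as a plain process-wide singleton (illustrative only; class and method names are made up, this is not the actual clearml code):

```python
class MainTask:
    """Process-wide singleton mirroring the 'main Task' behavior:
    `init` creates it once, `current` returns it, and other instances
    can exist without ever becoming the main one."""
    _main = None

    def __init__(self, name):
        self.name = name

    @classmethod
    def init(cls, name):
        # First call creates the main task; later calls return it
        if cls._main is None:
            cls._main = cls(name)
        return cls._main

    @classmethod
    def current(cls):
        # Equivalent of Task.current_task(): the process's main task
        return cls._main
```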