node.base_task_id is the base task, which will always be in draft mode. Instead we should use node.executed, which references the currently executed node.
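A minimal sketch of how that could look in a step callback (the callback signature follows the PipelineController docs; names here are illustrative):

from clearml import Task

def post_execute_callback(pipeline, node):
    # node.base_task_id points at the draft "template" task;
    # node.executed holds the id of the task that actually ran for this node
    executed_task = Task.get_task(task_id=node.executed)
    print(node.name, executed_task.get_status())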
YES, maybe we should add that into the example, so it is clearer ? WDYT?
Just to make sure I understand: running locally creates the Args/command correctly, and execute_remotely creates the correct Args/command, but when the agent actually executes it on the remote machine, it updates the Args/command back to a list. Is that a correct description?
This is what I think you should end up with:
DiscreteParameterRange('General/dataset_url', values=["option 1 for url", "option 2 for url"])
If args['dataset_url'] is a list, you should just do values=args['dataset_url']
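For context, a hedged sketch of that branch (assuming args is the dict already holding the parameter values):

from clearml.automation import DiscreteParameterRange

# use the list as-is if one was passed, otherwise wrap the single url in a list
urls = args['dataset_url'] if isinstance(args['dataset_url'], list) else [args['dataset_url']]
param_range = DiscreteParameterRange('General/dataset_url', values=urls)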
In Windows, setting system_site_packages to true allowed all stages in the pipeline to start, but it doesn't work in Linux.
Notice that it will inherit from the system packages, not the venv the agent is installed in.
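For reference, a sketch of where that flag lives in the agent's clearml.conf (assuming the standard agent config layout):

agent {
    package_manager {
        # let the venv the agent creates inherit the system interpreter's packages
        system_site_packages: true
    }
}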
I've deleted tfrecords from the master branch and committed the removal, and set the tfrecords folder to be ignored in .gitignore. Trying to find which changes are considered to be uncommitted.
you can run git diff, it is essentially...
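For example, to see what git still considers uncommitted:

git status          # overview of staged / unstaged / untracked files
git diff            # unstaged changes
git diff --cached   # staged but not yet committed changes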
Hi DepressedChimpanzee34
This is not a query call, this is a reporting call. See the docs below:
https://clear.ml/docs/latest/docs/references/api/workers#post-workersstatus_report
It is used by the worker to report its own status.
I think this is what you are looking for:
https://clear.ml/docs/latest/docs/references/api/workers#post-workersget_stats
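If it helps, a hedged sketch of calling that endpoint through the Python APIClient (parameter names are taken on trust from the API reference above, adjust as needed):

import time
from clearml.backend_api.session.client import APIClient

client = APIClient()
# aggregated worker stats for the last hour, sampled every 60 seconds
stats = client.workers.get_stats(
    from_date=time.time() - 3600,
    to_date=time.time(),
    interval=60,
    items=[{"key": "cpu_usage"}],
)
print(stats)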
Hi ExuberantBat52
I do not think you can... I would use AWS Secrets Manager to push the entire user list config file. WDYT?
UnsightlyShark53 Awesome, the RC is still not available on pip, but we should have it in a few days.
I'll keep you posted here :)
cannot schedule new futures after interpreter shutdown
This implies the process is shutting down.
Where are you uploading the model? What is the clearml version you are using? Can you check with the latest version (1.10)?
When I passed specific arguments (for example --steps) it ignored them...
script.py test blah1 blah2 blah3 42
Is this how it is intended to be used?
or creating a dedicated function I would suggest also including the actual sampled point in the HP space.
Could you expand?
This would be the most common use case, and essentially the reason for running the HPO: understanding the sensitivity of metrics with respect to the hyper-parameters.
Does this relate to:
https://github.com/allegroai/clearml/issues/430
manually" filtering the keys I've put in for the HP space. I find it a bit strange that they are not saved as part of t...
JitteryCoyote63 what's the clearml version?
Are you always seeing the "model uploaded completed" message ?
What's the python version you are using?
Could it be the polling on the Task (can't remember what the interval is)? It will update its state once every X minutes/seconds.
Nice, that seems to be the issue. Any chance you can open a GitHub issue, so we do not lose track of it?
Hi MelancholyElk85
Can I manually delete .zip files with datasets in the .clearml/cache/storage_manager/datasets directory?
Yes, you can. I "think" the .zip is stored for easier access, but you can delete it, as long as the "extracted" folder exists, it should be fine.
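For example (path assumed to be the default cache location):

# remove only the cached archives; the extracted dataset folders stay intact
find ~/.clearml/cache/storage_manager/datasets -name "*.zip" -delete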
HandsomeCrow5 OMG the guys already added it to the debug samples as well, check out the demo app (drop-down "test html sample"):
https://demoapp.trains.allegro.ai/projects/4e7fef090aa849b1acc37d92b59b3360/experiments/83c9ed509f0e421eaadc1ef56b3af5b4/info-output/debugImages
can you bump me to that thread?
https://clearml.slack.com/archives/CTK20V944/p1630610430171200
I realise I'll need to catalogue all the dataset ids created by people separately on a spreadsheet.
Okay this part I missed, why would you need to add additional "catalog" when you have the UI?
Hi DeliciousBluewhale87
Yes, that should have worked. Can you verify the task status?
print(Task.get_task(...).get_status())
Hmm, #790 should be solved in 1.7.2
Yes, I always see the "model uploaded completed" for such stuck tasks
Any chance this is reproducible?
How many processes do you see running (i.e. ps -Af | grep python)?
What is the training framework? Is it multiprocess? How are you launching the process itself? Is it Linux OS? Is it running inside a specific container?
I get the same "white" image in both TB & ClearML 🙂
Hi TrickySheep9
Hmm I think you are correct, execute_remotely will not work inside a Jupyter notebook because it will not be able to close it.
I was just revising workflows that might be similar, wdyt?
https://clearml.slack.com/archives/CTK20V944/p1620506210463400?thread_ts=1614234125.066600&cid=CTK20V944
Hi ReassuredTiger98
It's clearml that needs to support subparsers, and it does support them.
What are you seeing in the Args section?
(Notice that in the end all the parsed args are stored on the global "args" variable after you call parse_args(); clearml will basically take those variables and put them into the Args section.)
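A minimal sketch of subparsers with clearml (project/task names are illustrative):

from argparse import ArgumentParser
from clearml import Task

parser = ArgumentParser()
subparsers = parser.add_subparsers(dest="command")
train = subparsers.add_parser("train")
train.add_argument("--steps", type=int, default=100)

task = Task.init(project_name="examples", task_name="subparser demo")
args = parser.parse_args()  # the parsed values end up in the task's Args section
print(args.command, getattr(args, "steps", None))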
I don't see any requests
This points to configuration; specifically, maybe it is directed to a different server?!
UnevenDolphin73 if you have the time to help fix / make it work, it will be greatly appreciated 🙂
BTW: you will be losing the comments 🙂
Is there any progress made on the clearml-serving repo?
Hi JitteryCoyote63
yes, things are progressing slower than expected, I'm expecting actual work will be pushed in early Jan. On the bright side we are trying to work closely with TorchServing team and Nvidia Triton to expand capabilities.
Currently it seems the setup will be a "proxy server container" for pre/post processing, then a serving engine container (Triton/Torch), with a monitoring container as the control plane (i.e. collecting s...
Hi PlainSquid19
Did you check the website https://allegro.ai ?
If you need more info I would just fill in the contact info, I'm sure the sales guys will get back to you soon 🙂
Correct 🙂
You can spin it up in two modes, either venv or docker. Notice that even in docker mode it will still clone the code into the container and install the packages inside it, but it also inherits the docker image's preinstalled system packages, so the installation process is a lot faster, and you still have the ability to change packages without having to build an entire new docker image.
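For reference, the mode is picked when spinning up the agent (queue name and image here are illustrative):

# venv mode: packages installed into a fresh virtual environment
clearml-agent daemon --queue default

# docker mode: runs inside the given image, inheriting its system packages
clearml-agent daemon --queue default --docker nvidia/cuda:11.8.0-runtime-ubuntu22.04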
oh that makes sense.
I would add to your Task's docker startup script the following:
ls -la /.ssh
ls -la ~/.ssh
cat ~/.ssh/id_rsa
Let's see what you get