Is it possible to change an existing model's URL?
Edit the DBs ... That's basically the only way 🙂
That depends on the HPO algorithm. Basically, Tasks will be pushed based on the limit of "concurrent jobs", so you do not end up exploding the queue. It might also be a Bayesian process, i.e. based on the previous set of parameters and runs, like how HyperBand works (optuna/hpbandster)
Make sense ?
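For reference, a minimal sketch of capping concurrency with ClearML's optimizer (the base task ID and parameter range here are made up):

from clearml.automation import HyperParameterOptimizer, UniformParameterRange
from clearml.automation.optuna import OptimizerOptuna

optimizer = HyperParameterOptimizer(
    base_task_id="<template-task-id>",  # hypothetical template Task to clone
    hyper_parameters=[UniformParameterRange("General/lr", min_value=1e-4, max_value=1e-1)],
    objective_metric_title="validation",
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=OptimizerOptuna,
    max_number_of_concurrent_tasks=4,  # never more than 4 Tasks in the queue at once
)
optimizer.start()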
Is there a way to do this using ssh keys?
the .ssh of the host machine should be automatically mounted; you can force it by setting force_git_ssh_protocol: true in your clearml.conf
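A minimal sketch of the relevant clearml.conf section (assuming the standard agent configuration layout):

agent {
    # clone git repositories over SSH instead of HTTPS
    force_git_ssh_protocol: true
}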
It is still not working for me. Are you using Linux, Windows or macOS?
It should work on Linux, macOS, and Windows. What are you using?
Can't figure out what made it get to this point
I "think" this has something to do with loading the configuration and setting up the "StorageManager".
(in other words setting the google.storage)... Or maybe it is the lack of google storage package?!
Let me check
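If it is the missing package, something like this might be all that's needed (assuming GCS support comes from the standard client library):

pip install google-cloud-storage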
Hi ReassuredTiger98
An agent's queue priority translates to the order in which the agent will pull jobs from its queues.
Now let's assume we have two agents, one with queue priorities A,B and the other with B,A. If we only push a Task to queue A, and both agents are idle (implying queue B is empty), there is no guarantee which one will pull the job.
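For example, the queue order on the daemon command line sets the pull priority (queue names here are made up):

clearml-agent daemon --queue queue_a queue_b
# this agent always checks queue_a first, and only then queue_b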
Does that make sense ?
What is the use-case you are trying to solve/optimize for ?
Hi NastyFox63
This seems like most of the reports are converted to PNGs (which is what the automagic does if it fails to convert the matplotlib figure into an interactive plot).
no more than 114 plots are shown in the plots tab.
Are you saying we have 114 limit on plots ?
Is this true for "full screen" mode (i.e. not in the experiments table, but switching to the full detailed view)?
Hi @<1554275802437128192:profile|CumbersomeBee33>
what do you mean by "will the dependencies will be removed or not" ?
The next time the agent spins up a new Task it will create a new venv and delete the previous one
Hmm, I still wonder what the "correct" answer is for most people. Is an empty string in argparse redundant anyhow? Will anyone ever use it?
Makes total sense!
Interesting, you are defining the sub-component inside the function. I like that, it makes the code closer to how it is actually executed!
The training loop is around line 469, I think.
I think the model state is saved just after the training loop (not inside the loop), no?
Yes JitteryCoyote63, I think you are correct, this is currently the easiest way to do it. PompousParrot44 notice that you should have a "services" queue with a trains-agent in "services mode" running, to enqueue those types of mostly-sleeping services 🙂
I was thinking we could quickly create a service that does that, maybe leveraging one of these?
https://github.com/mehrdadmhd/scheduler-py
https://github.com/dbader/schedule
WDYT?
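For instance, a minimal sketch using the second library (the job body is just a placeholder):

import time
import schedule

def clone_and_enqueue():
    # placeholder: clone a template Task and push it into an execution queue
    pass

schedule.every().day.at("02:00").do(clone_and_enqueue)

while True:
    schedule.run_pending()
    time.sleep(60)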
Hi JitteryRaven85
I have also deleted some hyper-params but they appear again when training starts.
Yes you cannot "delete" parameters, as any missing parameter is synced back (making sure you have a full log).
The problem is that when I clone an experiment and change the hyper params some change and some remain the same
Could you expand on which parameters stay the same ? (obviously this should not happen)
Well it seems we forgot that one 🙂 I'll quickly make sure it is there.
As a quick solution (no need to upgrade):
task.models["output"]._models.keys()
It has to be alive so all the "child nodes" could report to it....
GreasyPenguin14 let me check with the guys when the next version is due.
Are you using the self-hosted server or the community server?
thanks MagnificentSeaurchin79 , yes that makes it clear.
If that is the case, I think building a container is the easiest solution 🙂
(BTW: you could also build a wheel; if you have a setup.py then running bdist_wheel once will build a wheel, and then you can install that wheel)
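A sketch of what that looks like (the actual wheel filename depends on your package name and version):

python setup.py bdist_wheel
pip install dist/*.whl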
But I am starting to wonder whether it would be easier to just change sys.path in the scripts that use the sibling libs.
that depends, how would the sibling packages get to a remote machine ?
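If you do go the sys.path route, a minimal sketch (sibling_lib and the repo layout are hypothetical):

import os
import sys

# add the repo root (which contains the sibling package) to the import path
repo_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, repo_root)

import sibling_lib  # hypothetical sibling package name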
Yes actually that might be it. Here is how it works,
It launches a thread in the background to do all the analysis of the repository, extracting all the packages.
If the process ends (for any reason), it will give the background thread 10 seconds to finish and then give up. If the repository is big, the analysis can take longer, and it will be cut short
how to make sure it will traverse only current package?
Just making sure there is no bug in the process: if you call Task.init in your entire repo (serve/train), do you end up with an "installed packages" section that contains all the required packages for both use cases?
I have separate packages for serving and training in a single repo. I donβt want serving requirements to be installed.
Hmm, it cannot "know" which is which, because it doesn't really trace all the import logs (this w...
DepressedChimpanzee34
I am actually curious now, why is the default like this? maybe more people are facing similar bottlenecks?
On "regular" load there is no need for multiple processes, and the memory consumption might be more important than reply lag (at least before you start to scale)
DisturbedWalrus17
By spawning multiple processes for the API server, it looks like we utilise the CPU more now but the UI and API calls are still lagging a lot
Can you try with even more ...
Okay, found the issue. To disable SSL verification globally, add the following env variable:
CLEARML_API_HOST_VERIFY_CERT=0
(I will make sure we fix the actual issue with the config file)
Can you please tell me how to return the folder where the script should run?
Add it to the Python path, e.g.:
PYTHONPATH="/src/project"
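For example (train.py is a hypothetical entry point):

PYTHONPATH="/src/project" python /src/project/train.py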
I do not think this is the upload timeout; it makes no sense to me for the GCP package (we do not pass any timeout, it's their internal default for the argument) to include a 60 sec timeout for upload...
I'm also not sure where the timeout originates (I'm assuming the initial GCP handshake connection could not actually time out, as the response should be relatively quick, so 60 sec is more than enough)
Could not find a version that satisfies the requirement open3d==0.15.2 .. from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)
This points to the agent installing with a different Python version than the one you used to run the original code; I would guess Python 3.6
MuddySquid7 I might have found something, and this is very, very odd: it seems it will Not upload any new images past the history size, which is very odd considering the number of users actively using this feature...
Do you want to try a hack to see if it solved your issue ?
However, regarding your recommendation of using the StorageManager class to delete the URL, it seems that this class only contains methods for checking existence of files, downloading files and uploading files, but no method for actually deleting files based on their URL (see doc and ).
Yes you are correct 🙂 you should use a "deeper" class:
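(assuming the import path, which may move between versions:)
from clearml.storage.helper import StorageHelper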
helper = StorageHelper.get(remote_url)
helper.delete(remote_url)
Hi GiganticTurtle0
You can keep clearml following the dictionary, auto-updating the UI:
args = task.connect(args)
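A minimal end-to-end sketch (project and task names are made up):

from clearml import Task

task = Task.init(project_name="examples", task_name="dict-hparams")
args = {"lr": 0.001, "batch_size": 32}
args = task.connect(args)  # values edited in the UI are synced back into args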
Let me see if I can reproduce something