I suspect it failed to create one on the host and then mount it into the docker container
Hi @<1523702969063706624:profile|PoisedShark13>
However, the INSTALLED PACKAGES of my task is missing many of the installed packages (any idea why?)
It automatically detects the directly imported packages, literally analyzing your code base and looking for imports.
The derivative packages (i.e. the ones that any of the "main" packages need) will be listed after the first time the agent installs everything.
If something specific is missing, you can manually add it with:
Task.add_requirements(...)
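A minimal sketch of that call (package name/version and project names are placeholders); note it must run before Task.init:
from clearml import Task

# register a package the import scanner missed -- must be called before Task.init
Task.add_requirements(package_name="my_package", package_version=">=1")
task = Task.init(project_name="examples", task_name="manual requirement")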
Hi FiercePenguin76
By default clearml will list only the packages you import, and not derivative packages.
This means that if you import package X and it imports package Y, only package X will be listed.
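For example, a hypothetical script (pandas is a real package that itself imports numpy; project/task names are placeholders):
from clearml import Task
import pandas  # pandas internally depends on numpy

task = Task.init(project_name="examples", task_name="package detection")
# "Installed Packages" will list pandas (directly imported) but not numpy
# (a derivative package), until an agent runs the task and pushes back the pip freeze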
The way it should work is by statically analyzing the entire repository, but if you import a local package from a different local folder, and that folder is not in the same repo, it will not get listed (obviously if you install the external local package, it will be...
Or do you want to generate it from a previously executed run?
If in the "installed packages" I have all the packages installed from the requirements.txt, then I guess I can clone it and use "installed packages"
After the agent finishes installing the "requirements.txt", it will put the entire "pip freeze" back into the "installed packages". This means that later we will be able to fully reproduce the working environment, even if packages change (which will eventually happen, as we cannot expect everyone to constantly freeze versions).
My problem...
Hi UnsightlyHorse88
Hmm, try adding to your clearml.conf file:
agent.cpu_only = true
If that does not work, try adding to the OS environment:
export CLEARML_CPU_ONLY=1
@<1720249421582569472:profile|NonchalantSeaanemone34>
dso = Dataset.create(
    dataset_project=project_name,
    dataset_name=dataset_name,
    parent_datasets=[parent_datasets_id],
)

dso = Dataset.get(
    dataset_project=project_name,
    dataset_name=dataset_name,
    only_completed=True,
    only_published=False,
    alias='latest',
)
Why are you creating a dataset and then getting a dataset into the same variable?
it seems you are trying to upload...
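For reference, a minimal sketch of the two separate flows (reusing your variable names; the path is a placeholder) -- create a new version and upload files, or get an existing version for consumption:
# creating a new dataset version and uploading files
ds = Dataset.create(
    dataset_project=project_name,
    dataset_name=dataset_name,
    parent_datasets=[parent_datasets_id],
)
ds.add_files(path="/data/new_files")
ds.upload()
ds.finalize()

# separately, getting the latest completed version for consumption
ds = Dataset.get(
    dataset_project=project_name,
    dataset_name=dataset_name,
    only_completed=True,
    alias="latest",
)
local_copy = ds.get_local_copy()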
The web-server seems okay; could you send the logs from the api-server?
Also if you can, the console logs from your browser, when you get the blank screen. Thanks.
Hi DeliciousBluewhale87 ,
Yes, they do (I think it's ClearML Enterprise or Allegro ClearML). I also know it has extended capabilities in data management, permissions, and security.
Beyond that, you should probably talk to them directly ( https://clear.ml/contact-us/ ) 🙂
Hi ReassuredOwl55
The easiest is to configure it as the default output_uri in the clearml.conf file of the agent, wdyt?
https://github.com/allegroai/clearml-agent/blob/ebb955187dea384f574a52d059c02e16a49aeead/docs/clearml.conf#L430
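For reference, a minimal sketch of the relevant clearml.conf section (the bucket URL is a placeholder):
sdk {
    development {
        # every Task will upload models/artifacts here unless output_uri is set explicitly
        default_output_uri: "s3://my-bucket/clearml"
    }
}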
Hi @<1610083503607648256:profile|DiminutiveToad80>
I think we will need more context for the log...
but I think there is something wrong with the GCP resource configuration of your autoscaler
Can you send the full autoscaler log and the configuration?
No. Since you are using Pool, there is no need to call Task.init again. Just call it once before you create the Pool; then, when you want to use it, just do task = Task.current_task()
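Something like this minimal sketch (project/task names are placeholders):
from multiprocessing import Pool
from clearml import Task

def worker(x):
    # re-attach to the task created in the main process
    task = Task.current_task()
    task.get_logger().report_text("processing {}".format(x))
    return x * 2

if __name__ == "__main__":
    # call Task.init once, before creating the Pool
    task = Task.init(project_name="examples", task_name="pool example")
    with Pool(4) as pool:
        results = pool.map(worker, range(8))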
Also, I just wanted to say thanks for the tool! I'm managing a small data science practice and it's going to be really nice to have a view of all of the experiments we've got and know our GPU utilization, all without having to give every data scientist access to each box where the workflows are run. Incredibly stoked.
♥ ❤ ♥
HurtWoodpecker30
The agent uses the requirements.txt
What do you mean by that? Aren't the packages listed in the "Installed packages" section of the Task?
(Or is it empty when starting, i.e. it uses the requirements.txt from the GitHub repo, and then the agent lists them back into the Task?)
IntriguedRat44 could I ask you to open a GitHub issue on it?
I really do not want it to slip through our fingers...
(BTW: meanwhile I was not able to reproduce it; what's the OS / nvidia drivers you are using?)
Hi @<1566959357147484160:profile|LazyCat94>
So it seems the arg parser is detecting the configuration YAML
The first thing I would suggest is changing it to a relative path (so that when launched on remote machines it will find the YAML file)
Regardless, how are you launching the HPO? Are you spinning up a new agent?
(as background, argparser arguments are injected in realtime by the agent or the HPO when running as subprocesses)
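In other words, a minimal sketch (argument and project/task names are placeholders):
import argparse
from clearml import Task

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--config", type=str, default="configs/train.yaml")  # relative path

task = Task.init(project_name="examples", task_name="hpo base task")
# the parsed values are registered as hyperparameters; when the agent / HPO
# re-runs this script as a subprocess, overridden values are injected back here
args = parser.parse_args()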
GentleSwallow91 notice that on the Task you have "Installed Packages"; this is the equivalent of requirements.txt. You can edit it and add a missing package, or programmatically add it in code (though usually directly imported packages are automatically registered; how come this one is missing?)
To add a package in code:
Task.add_requirements(package_name="my_package", package_version=">=1")
task = Task.init(...)
base docker image but ClearML has not determined it during the script ru...
Hi SubstantialElk6
I'm not sure what you are asking 🙂
Basically the clearml-agent will pull a Task from an execution queue and execute it (based on the definition on the Task, i.e. git repo, python packages, docker image, etc.)
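For example, a minimal sketch of the flow (queue and project/task names are placeholders):
from clearml import Task

# clone an existing task and enqueue it; an agent listening on the queue
# (e.g. started with: clearml-agent daemon --queue default --docker)
# will pull it and execute it per the task definition
template = Task.get_task(project_name="examples", task_name="train")
cloned = Task.clone(source_task=template)
Task.enqueue(cloned, queue_name="default")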
Hi SubstantialElk6
We try to push a fix the same day a HIGH CVE is reported. That said, since the external API interface is relatively far away from the DBs / OS, and since, as a rule of thumb, authorized users are trusted (basically, inheriting agent code execution means they have to be), it is an exception to have a CVE that affects the system. I think even this high-profile one does not actually have an effect on the system, as even if ELK is susceptible (which it is not), only authorized users co...
If I checkout/download dataset D on a new machine, it will have to download/extract 15GB worth of data instead of 3GB, right? At least I cannot imagine how you would extract the 3GB of individual files out of zip archives on S3.
Yes, I'm not sure there is an interface to extract only partial files from the zip (although worth checking).
I also remember there is a GitHub issue with uploading a 50GB dataset, and the bottom line is, we should support setting chunk size, so that we can uploa...
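A minimal sketch of what that would look like (assuming a clearml version whose Dataset.upload accepts chunk_size, in MB; names and paths are placeholders):
from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="big-dataset")
ds.add_files(path="/data/raw")
# split the uploaded archive into ~500MB chunks instead of one huge zip
ds.upload(chunk_size=500)
ds.finalize()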
report_text does not, which is very weird
Okay this seems to be the issue.
Just making sure: the Task status is "running" and task.get_logger().report_text("something") does not report a thing?
Do you see it on your screen?
Can you test without the "Task.debug_simulate_remote_task / init"?
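i.e. a minimal repro without the remote simulation (project/task names are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="report_text check")
# if this shows up in the console and under the task's CONSOLE tab,
# report_text itself is working
task.get_logger().report_text("something")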
FlatStarfish45
In the parent task, the libs appear installed.
What do you mean by "parent Task"? Is this the base task we are optimizing (i.e. the experiment / model we are optimizing)?
Or is it the "Optimization Task" itself?
which is probably why it does not work for me, right?
Correct, you need to pass the entire configuration (it is stored as a blob, as opposed to the hyperparameters that are stored as individual values)
:param configuration_overrides: Optional, override Task configuration objects.
    Expected dictionary of configuration object name and configuration object content.
    Examples:
        {'General': dict(key='value')}
        {'General': 'config...
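For instance, a minimal sketch assuming a PipelineController step (one of the APIs that accepts configuration_overrides; all names are placeholders):
from clearml import PipelineController

pipe = PipelineController(name="example pipeline", project="examples", version="1.0")
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="train",
    # the whole configuration object is replaced as a blob
    configuration_overrides={"General": dict(key="value")},
)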
if I run my own ClearML self-hosted server?
Then you have everything on your end; it will not communicate with the SaaS offering, meaning no limits whatsoever.
(That said some of the cloud auto-scaling and compute features are not part of the open source)
What's the OS running the server?
CourageousLizard33 Are you using the docker-compose to set up the trains-server?
BattyLion34
if I simply clone nntraining stage and run it in default queue - everything goes fine.
When you compare the Task you cloned manually and the Task created by the pipeline, what's the difference?
It should be autodetected, and listed in the installed packages with something like:
keras-contrib @ git+https://www.github.com/keras-team/keras-contrib.git
Is this what you are seeing?
If not, you can add it manually with:
Task.add_requirements('git+https://www.github.com/keras-team/keras-contrib.git')
Task.init(...)
Notice: add_requirements must be called before Task.init