Generally the StorageManager seems a bit slow; even a simple StorageManager.list(...) on a local path seems to take a long time.
Exactly; the cloud instances (that are run with clearml-agent) should have that clearml.conf, plus any changes specified in extra_clearml_configuration for the scaler.
I think the environment variables path might work for you then?
You'd set your config with use_credentials_chain: ${CREDENTIALS_CHAIN}
Then in Python you could set os.environ['CREDENTIALS_CHAIN'] = 'true' or 'false' (environment values must be strings) before you make any calls to ClearML?
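A minimal sketch of the environment-variable approach above, assuming the clearml.conf entry reads use_credentials_chain: ${CREDENTIALS_CHAIN}. The helper name set_credentials_chain is hypothetical. One detail worth noting: os.environ only accepts strings, so the flag has to be written as "true" or "false" rather than as a Python bool.

```python
import os


def set_credentials_chain(enabled: bool) -> str:
    """Export the flag as a string; os.environ raises TypeError for non-str values."""
    value = "true" if enabled else "false"
    os.environ["CREDENTIALS_CHAIN"] = value
    return value


# Set the variable first, so that it is already defined when the
# configuration file is loaded and ${CREDENTIALS_CHAIN} is substituted.
set_credentials_chain(True)

# Only import and call ClearML after this point, e.g.:
# from clearml import Task
```

Setting the variable at the very top of the entry script, before any ClearML import, keeps the ordering easy to reason about.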
(the extra_vm_bash_script is what you're after)
Let me test it out real quick.
So basically what I'm looking for and what I have now is something like the following:
(Local) I have a well-defined aws_autoscaler.yaml that is used to run the AWS autoscaler. That same autoscaler is also run with CLEARML_CONFIG_FILE=....
(Remotely) The autoscaler launches, listens to the predefined queue, and is able to launch instances as needed. I would run a remote execution task object that's appended to the autoscaler queue. The autoscaler picks it up, launches a new instanc...
Maybe. When the container spins up, are there any identifiers regarding the task etc. available? I create a folder on the bucket per python train.py run so that the environment variables file doesn't get overwritten if two users execute almost simultaneously.
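One way to sketch the per-run folder idea above: build a unique key prefix for each invocation so that two near-simultaneous runs never write to the same location. The bucket name, the env-files prefix and the function name below are hypothetical; if a task ID is available inside the container, it could replace the random suffix.

```python
import uuid
from datetime import datetime, timezone


def make_run_prefix(bucket: str, user: str) -> str:
    """Return a unique per-run folder URI under s3://<bucket>/env-files/<user>/."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    suffix = uuid.uuid4().hex[:8]  # random part avoids collisions within the same second
    return f"s3://{bucket}/env-files/{user}/{stamp}-{suffix}/"


# Hypothetical bucket and user names, for illustration only.
prefix_a = make_run_prefix("my-bucket", "alice")
prefix_b = make_run_prefix("my-bucket", "alice")
```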
Actually, it appears some elements (scalars, plots, etc.) have not been migrated by moving the MongoDB data.
Where are these stored? Any idea JuicyFox94?
No it does not show up. The instance spins up and then does nothing.
I see that the GUI AutoScaler is only in the paid version, wonder why the GCP driver is not open source?
Not sure if ClearML has any built-in support, but we used the above for a similar issue, though with Prefect2 :)
Yes exactly that AgitatedDove14
Testing our logic maps correctly, etc for everything related to ClearML
Or some users that update their poetry.lock and some that update manually, as they prefer to resolve on their own.
Parquet file in this instance (used to be CSV, but that was even larger as everything is stored as a string...)
This also appears in the error log:
` StorageManager.download_folder(cache_dir.as_posix(), local_folder=".")
File "/home/idan/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/storage/manager.py", line 278, in download_folder
for path in helper.list(prefix=remote_url):
File "/home/idan/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/storage/helper.py", line 596, in list
res = self._driver.list_container_objects(self._container, ex_prefix=prefix)
Fi...
You can use logger.report_scalar and pass a single value.
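A minimal sketch of reporting a single value with report_scalar, written against any object that exposes that method so it can be tried without a server. The wrapper name and the choice of iteration=0 are assumptions; with ClearML the logger would typically come from Task.current_task().get_logger().

```python
def report_single_value(logger, name: str, value: float) -> None:
    """Report one number as a scalar with a single point at iteration 0."""
    logger.report_scalar(title=name, series=name, value=value, iteration=0)


class RecordingLogger:
    """Stand-in for a ClearML logger that only records what was reported."""

    def __init__(self):
        self.calls = []

    def report_scalar(self, title, series, value, iteration):
        self.calls.append((title, series, value, iteration))


# The metric name and value are made up for illustration.
fake = RecordingLogger()
report_single_value(fake, "final_accuracy", 0.93)
```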
I'll have some reports tomorrow I hope TimelyPenguin76 SuccessfulKoala55 !
We're still working these quirks out. But one issue after we changed the AMI is that the VPC (SubnetId?) was missing from the instance so it could not reach the ClearML API server.
I think maybe the autoscaler service is missing some additional settings...
I thought this follows from our previous discussion SuccessfulKoala55 , where this is a built-in feature of pyhocon?
Running a self-hosted server indeed. It's part of a code that simply adds or uploads an artifact 🤔
We have a mini default config (if you remember from a previous discussion we had) that actually uses the second form you suggested.
I wrote a small "fixup" script that combines this default with the one generated by clearml-init, and it simply does:
def_config = ConfigFactory.parse_file(DEF_CLEARML_CONF, resolve=False)
new_config = ConfigFactory.parse_file(new_config_file, resolve=False)
updated_new_config = ConfigTree.merge_configs(new_config, def_config)
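To make the merge order above concrete, here is a plain-dictionary sketch of a recursive merge in which the second argument wins on conflicts. It illustrates the precedence only and is not pyhocon's implementation; to my understanding, ConfigTree.merge_configs(a, b) merges b into a in the same spirit, so in the call above the default config would override the generated one. The keys and values are made up.

```python
def merge_dicts(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`; `override` wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_dicts(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical stand-ins for the generated config and the team defaults.
new_config = {
    "api": {"web_server": "http://localhost:8080"},
    "sdk": {"aws": {"region": "us-east-1"}},
}
def_config = {"sdk": {"aws": {"use_credentials_chain": True}}}

updated = merge_dicts(new_config, def_config)
```

Unlike this sketch, which returns a new dictionary, a library merge may modify its first argument in place, so it is worth checking before reusing the inputs.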
CostlyOstrich36 so internal references are not resolved somehow? Or, how should one achieve:
def my_step():
    from ..utils import foo
    foo("bar")
Opened this - https://github.com/allegroai/clearml/issues/530 let me know if it's not clear enough FrothyDog40 !