I guess the supported storage media (e.g. S3, Ceph, etc.) don't have this issue, right?
Is sdk.development.default_output_uri used with s3://ip:9000/clearml or ip:9000/clearml ?
Ah, thanks a lot. So for example the CleanUp Service ( https://github.com/allegroai/clearml/blob/master/examples/services/cleanup/cleanup_service.py ) should have no trouble deleting the artifacts.
Mhhm, then maybe it is not clear 😂 to me how clearml.Task is meant to be used. I thought of it as being a container for all the information regarding a single experiment that is reflected on the server-side and by this in the WebUI. Now I init() a Task and it will show in the WebUI. I thought after initialization I can still update the task to my liking, i.e. it being a documentation of my experiment.
No problem in my case at least.
Ah, okay, that's weird 🙂 Thank you for answering!
If I understood correctly, if I tried to print(os.environ["MUJOCO_GL"]) after the clearml Task is created, this should be set?
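A quick way to check this from inside the script is to read the variable defensively once the task exists. This is only a minimal sketch using the standard library: the helper name `describe_env` is made up for illustration, and whether `MUJOCO_GL` is actually set depends on how the agent is configured, which the snippet does not assume.

```python
import os


def describe_env(name, environ=None):
    """Return a short report on one environment variable.

    `environ` defaults to os.environ; passing a dict makes the helper
    easy to try out without touching the real environment.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)  # .get() avoids a KeyError when the variable is missing
    if value is None:
        return f"{name} is not set"
    return f"{name}={value!r}"


# In a real script this line would run after the ClearML task has been created.
print(describe_env("MUJOCO_GL"))
```

Using `.get()` instead of `os.environ["MUJOCO_GL"]` means a missing variable shows up as a readable message rather than a `KeyError`.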
Any idea why deletion of artifacts on my second fileserver does not work?
fileserver_datasets:
  networks:
    - backend
    - frontend
  command:
    - fileserver
  container_name: clearml-fileserver-datasets
  image: allegroai/clearml:latest
  restart: unless-stopped
  volumes:
    - /opt/clearml/logs:/var/log/clearml
    - /opt/clearml/data/fileserver-datasets:/mnt/fileserver
    - /opt/clearml/config:/opt/clearml/config
  ports:
    - "8082:8081"
ClearML successfu...
From the logs when run with --foreground, I do not see any conda create command.
I just realized that I forgot again that I am using importlib, and this is probably why everything is weird. I tried to reproduce the error with a smaller project and was not able to get the error again. Sorry for having wasted your time! 😐
It seems like the services-docker is always started with Ubuntu 18.04, even when I use task.set_base_docker("continuumio/miniconda:latest -v /opt/clearml/data/fileserver/:{}".format(file_server_mount))
In my case I use the conda freeze option and do not even have CUDA installed on the agents.
Yeah, but it could also be for other reasons. I'll try to find out somehow.
Is there a way to see the contents of /tmp/conda_envaz1ne897.yml ? It seems to be deleted after the task is finished.
Latest version for everything. I will message you again, if I encounter this problem again.
That I understand. But I think (old) pip versions will sometimes not resolve a package. Probably not the case the other way around.
Thanks a lot, now I think I understand.
Debug samples can only be controlled via api.file_server (or programmatically)
Could you guide me how to approach this programmatically? Can I implement my own storage adapter for debug samples with ClearML interfaces or am I on my own?
Then I could also do this:
# My custom very special use case
task = Task()
task = task.load_statedict(await Task.load_or_create(task_name))
await task.synchronize()
await run_code_analysis()
task.add_requirement("myreq")
await task.synchronize()
I use fixed users!
Is there a way to specify this on a per task basis? I am running clearml-agent in docker mode btw.
Don't know whether I do something wrong. Locally it works, but when executed via queue I get:
` File "run_task.py", line 14, in <module>
main()
File "run_task.py", line 9, in main
printme = importlib.import_module("some_package.file_to_import").printme
File "/home/tim/.clearml/venvs-builds/3.7/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd...
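When an `importlib.import_module` call works locally but fails on the agent, one thing worth checking is whether the package is on `sys.path` in the remote environment at all. The following is only a rough diagnostic sketch with the standard library: `can_import` is a made-up helper name, and `some_package.file_to_import` is simply the module name from the traceback above.

```python
import importlib
import importlib.util
import sys


def can_import(module_name):
    """Return True if module_name can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when a parent package cannot be found.
        return False


if __name__ == "__main__":
    name = "some_package.file_to_import"
    if can_import(name):
        printme = importlib.import_module(name).printme
    else:
        # Dump the search path so the local and remote runs can be compared.
        print(f"{name} not found; sys.path is:")
        for entry in sys.path:
            print("  ", entry)
```

Comparing the printed `sys.path` between the local run and the queued run should show whether the working directory or the repository root is missing on the agent side.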
I am still not getting why it is a problem to just update the requirements at any time... 😕
No no, I was just wondering how much effort it is to create something like ClearML. And your answer gives me a rough estimate 🙂
Thanks for the answer. So currently the cleanup is done based on the number of experiments that are cached? If I have a few big experiments, this could make my agent's cache overflow?
[root@dc01deffca35 elasticsearch]# curl `
{
"cluster_name" : "clearml",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 10,
"active_shards" : 10,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 10,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_nu...
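For reading a health response like the (truncated) one above: in Elasticsearch, "yellow" means all primary shards are assigned but some replica shards are not, which is expected on a single data node because a replica is never placed on the same node as its primary. The sketch below just encodes that reading. `explain_health` is a made-up helper, and the dictionary contains only the fields visible in the output above.

```python
def explain_health(health):
    """Give a one-line interpretation of an Elasticsearch cluster health dict."""
    status = health["status"]
    unassigned = health["unassigned_shards"]
    data_nodes = health["number_of_data_nodes"]

    if status == "green":
        return "green: all primary and replica shards are assigned"
    if status == "yellow":
        if data_nodes == 1 and unassigned > 0:
            return (
                f"yellow: {unassigned} replica shards cannot be assigned "
                "because there is only one data node"
            )
        return f"yellow: {unassigned} replica shards are unassigned"
    return "red: at least one primary shard is unassigned"


# Values copied from the visible part of the curl output.
health = {
    "status": "yellow",
    "number_of_nodes": 1,
    "number_of_data_nodes": 1,
    "active_primary_shards": 10,
    "active_shards": 10,
    "unassigned_shards": 10,
}
print(explain_health(health))
```

On a single-node setup, a yellow status of this kind does not by itself indicate data loss. It only says that there are no replica copies.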