If you think the explanation takes too much time, no worries! I do not want to waste your time on my confusion 😄
Maybe something like this is how it is intended to be used?
```python
# run_with_clearml.py
from clearml import PipelineController, Task


def get_main_task():
    # Create the task from the script without running it locally
    task = Task.create(project_name="my_project", task_name="my_experiment", script="main_script.py")
    return task


def run_standalone(task_factory):
    Task.enqueue(task_factory())


def run_in_pipeline(task_factory):
    pipe = PipelineController()
    pipe.add_step(preprocess, ...)
    pipe.add_step(base_task_factory=task_factory, ...)
    pipe.add_step(postprocess, ...)
    pipe.start()


if...
```
No idea what's happening there.
Obviously, a lot of stuff is missing from my examples. I just want to show that the user should be able to replicate `Task.init` easily, so that it can be configured in every way while still making use of the magic that ClearML has for everything that does not differ from the comfortable default way.
Yeah, but doesn't this feature make sense on a task level? If I remember correctly, some dependencies will sometimes require different pip versions. And dependencies are defined on a per-task basis.
I have a `carla.egg` file on my local machine and on the worker that I include with `sys.path.append` before I can do `import carla`. It is the same procedure on my local machine and on the clearml-agent worker.
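The procedure described above could look roughly like this. It is only a sketch: the helper name `add_egg_to_path`, the glob pattern, and the directory in the usage comment are assumptions for illustration and are not part of CARLA or ClearML.

```python
import glob
import os
import sys


def add_egg_to_path(search_dir, pattern="carla*.egg"):
    """Append the first egg in `search_dir` matching `pattern` to sys.path.

    Returns the path that was added, or None if no egg matched.
    """
    matches = sorted(glob.glob(os.path.join(search_dir, pattern)))
    if not matches:
        return None
    egg_path = os.path.abspath(matches[0])
    if egg_path not in sys.path:
        sys.path.append(egg_path)
    return egg_path


# Usage (hypothetical directory; it has to exist locally and on the worker):
# if add_egg_to_path("/opt/carla/PythonAPI/carla/dist") is not None:
#     import carla
```

Because the egg is added to `sys.path` at runtime and not installed with pip, it probably will not show up as an installed package, which would explain why it has to be present on the worker as well.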
Seems like some experiments cannot be deleted
In the first run the package only existed because it is preinstalled in the Docker image. Afaik, in the second run it is also preinstalled, but pip will first try to resolve it and only then check whether it already exists. But I am not too sure about this.
I created a GitHub issue because the problem with the slow deletion still exists. https://github.com/allegroai/clearml/issues/586#issue-1142916619
Thank you. Will try that!
But you can manually add them with Task.add_requirements, no?
In my opinion that is an ugly solution. I would have to keep track of which requirements are missing. At that point I would rather just add all requirements manually.
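For reference, "adding all requirements manually" could be sketched as below. The helper names `parse_requirement` and `add_all_requirements` are made up for this example. The only ClearML call used is `Task.add_requirements(package_name, package_version)`, which, as far as I know, has to be called before `Task.init`.

```python
def parse_requirement(line):
    """Split a pinned line such as 'numpy==1.19.5' into (name, version).

    Lines without an '==' pin return (name, None).
    """
    name, sep, version = line.strip().partition("==")
    return name.strip(), (version.strip() if sep else None)


def add_all_requirements(lines, add_requirement=None):
    """Register every non-empty, non-comment line via `add_requirement`.

    By default this uses clearml's Task.add_requirements (assumed to be
    called before Task.init so the packages are recorded on the task).
    """
    if add_requirement is None:
        # Imported lazily so the parsing part works without clearml installed
        from clearml import Task
        add_requirement = Task.add_requirements
    added = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, version = parse_requirement(line)
        add_requirement(name, version)
        added.append((name, version))
    return added
```

This still means maintaining the list by hand, which is exactly the drawback mentioned above. It only moves the bookkeeping into one place.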
However, this seems like a bug, or is something like this to be expected? There shouldn't be files that are not shown in the WebUI, should there?
```
[root@dc01deffca35 elasticsearch]# curl
{
  "cluster_name" : "clearml",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 10,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 10,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_nu...
```
But this seems like something that is not related to clearml 🙂 Anyways, thanks again for the explanations!
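For anyone reading the health output above: a yellow status usually means that all primary shards are assigned while some replica shards are not, and with a single data node the replicas typically have nowhere to go. The sketch below just does that arithmetic on the numbers from the output. The function name `summarize_health` and the returned keys are my own, and the single-node reading is an assumption about this cluster, not something the output proves.

```python
def summarize_health(health):
    """Return a short reading of an Elasticsearch cluster-health dict."""
    active = health["active_shards"]
    not_active = health["unassigned_shards"] + health["initializing_shards"]
    total = active + not_active
    return {
        "status": health["status"],
        # Share of all shards (primaries and replicas) that are active
        "active_share_percent": 100.0 * active / total if total else 100.0,
        # True when one data node holds everything and some shards are unassigned
        "single_node_with_unassigned": (
            health["number_of_data_nodes"] == 1
            and health["unassigned_shards"] > 0
        ),
    }


# Values copied from the output above
health = {
    "status": "yellow",
    "number_of_data_nodes": 1,
    "active_primary_shards": 10,
    "active_shards": 10,
    "initializing_shards": 0,
    "unassigned_shards": 10,
}
summary = summarize_health(health)
```

With these numbers, 10 of 20 shards are active, so half of the shards are unassigned while all 10 primaries are up.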
I think sometimes there can be dependencies that require a newer pip version or something like that. I am not sure though. Why can we even change the pip version in the clearml.conf?
SuccessfulKoala55 So what happens is that whenever the cleanup_service runs (or right after it), ClearML throws these kinds of errors
My code is in classes, indeed. But I have more than one model. Actually, all the things that people store in, for example, `yaml` or `json` configs I store in `python` files. And I do not want to statically import all the models/configs.
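To make "not statically importing" concrete, this is roughly what I mean. It is a minimal sketch using only the standard library, and the helper name `load_object` and the module and attribute names in the usage comment are hypothetical.

```python
import importlib


def load_object(module_name, attr_name):
    """Import `module_name` at call time and return its `attr_name` attribute."""
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise ValueError(
            f"{module_name!r} has no attribute {attr_name!r}"
        ) from None


# Usage with hypothetical names: the module is picked by a string, e.g. from
# the command line, so no model or config is imported statically.
# model_cls = load_object("configs.resnet_small", "Model")
```

The catch is that imports resolved at runtime like this are not visible to static analysis of the script, which is presumably why the packages they need are not detected automatically.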
Thank you very much for the fast work!
One last question: Is it possible to set the pip_version per task?
Mhhm, then maybe it is not clear 😂 to me how clearml.Task is meant to be used. I thought of it as a container for all the information regarding a single experiment, which is reflected on the server side and thereby in the WebUI. Now I init() a Task and it shows up in the WebUI. I thought that after initialization I could still update the task to my liking, i.e. treat it as the documentation of my experiment.
Seems to happen only while the cleanup_service is running!
```
[2021-05-07 10:53:00,566] [9] [WARNING] [elasticsearch] POST [status:N/A request:60.061s]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib64/python3.6/http/client.py", lin...
```