That is, we have something like:
```
task = Task.init(...)
ds = Dataset.create(dataset_name=task.name, dataset_project=task.get_project_name(), use_current_task=True)
# upload files
ds.upload(show_progress=True)
ds.finalize()
# do stuff with task and dataset
task.close()
```
But because the dataset is linked to the task, the task is then moved and effectively becomes invisible 😕
Any thoughts AgitatedDove14 SuccessfulKoala55 ?
When is the next release expected? 😄
The logs are on the bucket, yes.
The default file server is also set to s3://ip:9000/clearml
I was thinking of using the `--volume` settings in `clearml.conf` to mount the relevant directories for each user (so it's somewhat customizable). Would that work?
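Roughly what I have in mind, as a sketch (untested; assuming `agent.extra_docker_arguments` is the key that forwards flags to the agent's `docker run`, and with placeholder paths):
```
# clearml.conf (sketch)
api {
    # the MinIO-backed file server mentioned above
    files_server: "s3://ip:9000/clearml"
}
agent {
    # assumption: extra_docker_arguments forwards these to `docker run`,
    # so each user's directory can be mounted into the container
    extra_docker_arguments: ["--volume", "/home/<user>/data:/data"]
}
```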
It would be amazing if one could specify specific local dependencies for remote execution; those would be uploaded to the file server and downloaded before the code starts executing.
One must then ask, of course, what to do if e.g. a text refers to a dictionary configuration object? 🤔
Right and then for text (file path) use some regex or similar for extraction, and for dictionary simply parse the values?
Answering myself for future interested users (at least GrumpySeaurchin29 I think you were interested):
You can "hide" (explained below) secrets directly in the agent 😁 :
When you start the agent listening to a specific queue (i.e. the services worker), you can specify additional environment variables by prefixing them to the execution, e.g. `FOO='bar' clearml-agent daemon ...`
Modify the example AWS autoscaler script - after the `driver = AWSDriver.from_config(conf)`, inject ...
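To illustrate the consuming side of this (a sketch only - not the injected code elided above): anything that runs on that worker can then read the value from its environment rather than from the task itself, e.g.:
```
import os

# FOO was set when launching the agent (FOO='bar' clearml-agent daemon ...),
# so it doesn't need to be stored in the task itself; note that any script
# running on the worker can still read (and expose) it
secret = os.environ.get("FOO")
if secret is None:
    raise RuntimeError("FOO is not set on this worker")
```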
CostlyOstrich36 I'm not sure what you mean by "through the apps", but any script AFAICS would expose the values of these environment variables; or what am I missing?
True, and we plan to migrate to pipelines once we have some time for it :) but anyway that condition is flawed I believe
So now we need to pass `Task.init(deferred_init=0)` because the default `Task.init(deferred_init=False)` is wrong.
That's a nice workaround of course - I'm sure it works and I'll give it a shot momentarily. I'm just wondering if ClearML could automatically recognize image files in `upload_artifact` (and other well-known suffixes) and do that for me.
Actually TimelyPenguin76 I get only the following as a "preview" -- I thought the preview for an image would be... the image itself..?
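For reference, the workaround I'm trying is roughly this (a sketch, untested; assuming the trick is to pass a PIL Image object rather than the raw file path, and with placeholder project/task/file names):
```
from PIL import Image
from clearml import Task

task = Task.init(project_name="examples", task_name="artifact-preview")  # placeholder names

# Pass a PIL Image object instead of the file path, so the artifact is
# stored as an image (and should then get an image preview in the UI)
task.upload_artifact(name="confusion_matrix", artifact_object=Image.open("confusion_matrix.png"))
```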
Thanks David! I appreciate that, it would be very nice to have a consistent pattern in this!
Note that it would succeed if e.g. run with pytest -s
SmugDolphin23 I think you can simply change `not (type(deferred_init) == int and deferred_init == 0)` to `deferred_init is True`?
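For the record, a quick check of why `False` and `0` take different paths through that expression (plain Python, nothing ClearML-specific):
```
# bool vs int under an exact type comparison
deferred_init = False
print(type(deferred_init) == int)  # False -- type(False) is bool, not int
print(deferred_init == 0)          # True  -- False == 0
print(not (type(deferred_init) == int and deferred_init == 0))  # True

deferred_init = 0
print(not (type(deferred_init) == int and deferred_init == 0))  # False
```
So the default `deferred_init=False` and an explicit `deferred_init=0` end up on different sides of that check, which is the flaw mentioned above.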
I'll see if we can do that still (as the queue name suggests, this was a POC, so I'm trying to fix things before they give up 😛 ).
Any other thoughts? The original thread https://clearml.slack.com/archives/CTK20V944/p1641490355015400 suggests this PR solved the issue
Something like this, SuccessfulKoala55 ?
1. Open a bash session on the docker: `docker exec -it <docker id> /bin/bash`
2. Open a mongo shell: `mongo`
3. Switch to the backend db: `use backend`
4. Get the relevant project IDs: `db.project.find({"name": "ClearML Examples"})` and `db.project.find({"name": "ClearML - Nvidia Framework Examples/Clara"})`
5. Remove the relevant tasks: `db.task.remove({"project": "<project_id>"})`
6. Remove the project IDs: `db.project.remove({"name": ...})`
UPDATE: Apparently the quotation type matters for furl? I switched the `'` to `\"` and it seems to work now.
Holy crap this was a light-bulb moment, is this listed somewhere in the docs?
It solves so much of my issues xD
and I don't think it's in the docs - we'll add that
Very welcome update; please use some highlighting for it too - it's so important for a complete understanding of how the remote execution works.
Exactly; the cloud instances (that are run with `clearml-agent`) should have that `clearml.conf` + any changes specified in `extra_clearml_configuration` for the scaler.
I guess it does not do so for all settings, but only those that come from Session()
Right, but that's as defined in the services agent, which is not immediately transparent
Let me know if you do; would be nice to have control over that 😁
The idea is that the features would be copied/accessed by the server, so we can transition slowly and not use the available storage manager for data monitoring
Or some users that update their `poetry.lock` and some that update manually as they prefer to resolve on their own.
Well, you can install the binary in the additional startup commands.
Matter of fact, you can just include the ECR login in the "startup steps" offered by the scaler, so no need for this repository. I was thinking these are local instances.
Kinda, yes, and this has changed with 1.8.1.
The thing is that afaik currently ClearML does not officially support a remotely executed task to spawn more tasks, so we also have a small hack that marks the remote "master process" as a local task prior to anything else.
Coming back to this; ClearML prints a lot of error messages in local tests, supposedly because the output streams are not directly available:
```
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
    self.__shutdown...
```