Yes, a lot of moving pieces here as we're trying to migrate to AWS and set up the autoscaler and more.
I will! (once our infra guy comes back from holiday and updates the install; for some reason they set up server 1.1.1???)
Meanwhile wondering where I got a random worker from
AgitatedDove14 I will try! I remember there were some issues with it, where I had to resort to this method first, but maybe things have changed since :)
Hm, just a small update - I just verified and it does indeed work on Linux:
```python
import clearml
import dotenv

if __name__ == "__main__":
    dotenv.load_dotenv()
    config = clearml.backend_api.Config.load()  # Success, parsed with environment variables
```
Some examples of the mess it creates (also posted in the main channel):
- A single project now has multiple subprojects
- The subprojects have the `.datasets` hidden subproject (with really frustrating project names)
- The subprojects are empty
- To access the original project, I have to go twice into the same project because of these hidden projects
- Because of these hidden subprojects, I cannot delete a project that has 0 experiments
Sounds like incorrect parsing on ClearML's side then, doesn't it? At the very least, it doesn't fully support MinIO.
I don't imagine AWS users get a new folder named `aws-key-region-xyz-bucket-hostname` when they `download_folder(...)` from an AWS S3 bucket, or do they?
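For reference, this is roughly the kind of call I mean (a minimal sketch; the MinIO host, port, bucket, and paths are placeholders):

```python
from clearml import StorageManager

# Sketch: download a folder from a MinIO bucket. With MinIO the host:port is part
# of the s3:// URL, unlike native AWS S3 where only the bucket name appears.
local_copy = StorageManager.download_folder(
    remote_url="s3://minio.example.com:9000/my-bucket/some/folder",
    local_folder="/tmp/some_folder",
)
print(local_copy)
```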
Can I query where the worker is running (IP)?
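A hedged sketch of what I'd try, assuming the workers endpoint exposes the reported IP:

```python
from clearml.backend_api.session.client import APIClient

# Sketch only: list registered workers and print the IP each one reports.
# Assumes the workers.get_all response includes an `ip` field.
client = APIClient()
for worker in client.workers.get_all():
    print(worker.id, getattr(worker, "ip", "<no ip reported>"))
```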
That's probably in the newer ClearML server pages then, I'll have to wait still.
Thanks SuccessfulKoala55! Is this listed anywhere in the documentation?
Could I set an environment variable there and then refer to it internally in the config with the `${...}` notation?
I see https://github.com/allegroai/clearml-agent/blob/d2f3614ab06be763ca145bd6e4ba50d4799a1bb2/clearml_agent/backend_config/utils.py#L23 but not where it's called.
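Just to make my question concrete, this is the kind of thing I mean (a sketch only; the env var names are made up, and I'm not sure the config parser actually resolves it this way):

```
# clearml.conf (sketch) - would these pick up values from the environment?
sdk.aws.s3 {
    key: ${MINIO_ACCESS_KEY}
    secret: ${MINIO_SECRET_KEY}
}
```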
Dynamic pipelines in a notebook, so I don't have to recreate a pipeline every time a step is changed.
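Something along these lines is what I have in mind (a rough sketch using `PipelineController` from a notebook cell; the step functions, names, and project are made up):

```python
from clearml import PipelineController

def preprocess(n: int) -> int:
    return n * 2

def train(n: int) -> int:
    return n + 1

# Sketch: build the pipeline dynamically in a notebook cell,
# so editing a step only means re-running the cell.
pipe = PipelineController(name="notebook-pipeline", project="examples", version="0.0.1")
pipe.add_function_step(
    name="preprocess",
    function=preprocess,
    function_kwargs=dict(n=3),
    function_return=["n_out"],
)
pipe.add_function_step(
    name="train",
    function=train,
    function_kwargs=dict(n="${preprocess.n_out}"),
    parents=["preprocess"],
    function_return=["result"],
)
pipe.start_locally(run_pipeline_steps_locally=True)
```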
Hm, this didn't happen until now; I'd be happy to try again with a new version, but something with 1.4.0 broke our StorageManager, so we reverted to 1.3.2
Does that make sense SmugDolphin23 ?
Yes, that one shows up. I forgot to mention we also set the version explicitly, but that just creates a duplicate dataset under `Datasets`, and anyway our main `Task` is now hidden from the original project.
So project `project` exists, but it is empty.
I'm not too worried about the dataset appearing (or not) in the `Datasets` tab. I would like it (the original task) to not disappear from the original project I assigned it to.
After setting the `sdk.development.default_output_uri` in the configs, my code kinda looks like:
```python
task = Task.init(project_name=..., task_name=..., tags=...)
logger = task.get_logger()
# report with logger freely
```
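For completeness, the config side of that looks roughly like this (the bucket URL is a placeholder):

```
# clearml.conf (sketch)
sdk {
    development {
        default_output_uri: "s3://my-minio-host:9000/my-bucket/outputs"
    }
}
```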
Trying now with 1.4.1, but I believe the changes you're referring to SuccessfulKoala55 were also introduced in 1.4.0, right?
We just redeployed to use the 1.1.4 version as Jake suggested, so the logs are gone.
Seems like you're missing an image definition (AMI or otherwise)
@<1539780258050347008:profile|CheerfulKoala77> you may also need to define subnet or security groups.
Personally I do not see the point in Docker over EC2 instances for CPU instances (virtualization on top of virtualization).
Finally, just to make sure: you only ever need one autoscaler. A single autoscaler can monitor multiple queues with multiple instance types.
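Roughly, the resource entry should include something like this (a sketch only; field names are from memory and the values are placeholders, so double-check against the autoscaler example config):

```yaml
# Sketch of one autoscaler resource configuration (values are placeholders)
resource_configurations:
  default:
    instance_type: m5.xlarge
    ami_id: ami-0123456789abcdef0        # the missing image definition
    availability_zone: us-east-1a
    security_group_ids: ["sg-0123456789abcdef0"]
```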
It also happens when `use_current_task=False`, though. So the current best approach would be to not combine the task and the dataset?
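Just to make sure I understand, the split approach would look something like this (a sketch; the project, task, dataset names, and paths are placeholders):

```python
from clearml import Task, Dataset

# Sketch: keep the experiment task and the dataset as separate objects,
# instead of creating the dataset from the current task.
task = Task.init(project_name="my-project", task_name="training")

dataset = Dataset.create(
    dataset_project="my-project",
    dataset_name="training-data",
    use_current_task=False,  # explicit here, though False is the default
)
dataset.add_files("/path/to/data")
dataset.upload()
dataset.finalize()
```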
Here's an example where `poetry.lock` is removed, and still the console reads:
```
url: ....
branch: HEAD
commit: 22fffaf8d5f377b7f10140e642a7f6f26b72ffaa
root: /.../.clearml/venvs-builds/3.10/task_repository/...
Applying uncommitted changes
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv ds-platform in /.../.clearml/venvs-builds/3.10/task_repository/.../.venv
Updating dependencies
Resolving dependencies...
```
I also tried setting `agent.python_binary: "/usr/bin/python3.8"`, but it still uses Python 2.7?
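In clearml.conf it looks roughly like this (sketch of just the relevant agent section; the interpreter path is the one from my case):

```
# clearml.conf on the agent machine (sketch)
agent {
    # Ask the agent to build its venvs with this interpreter
    python_binary: "/usr/bin/python3.8"
}
```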
Any leads TimelyPenguin76? I've also tried setting up a MinIO S3 bucket, but I'm not sure if the remote agent has copied the credentials and host.
Not really - it will just show the string. A preview would be more like a low-res version of the uploaded image or similar.
Alternatively, it would be good to specify both some requirements and auto-detect.
A follow-up question (instead of opening a new thread): is there a way I could signal some files/directories to be copied to the `execute_remotely` task?
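In case it clarifies what I mean, here's a rough sketch of the workaround I'd otherwise reach for (artifacts; this is just an assumption on my part, and I'm not even sure artifacts survive the reset that `execute_remotely` performs):

```python
from clearml import Task

task = Task.init(project_name="my-project", task_name="remote-run")

# Sketch of a possible workaround: register local files as artifacts before going remote,
# then fetch them on the agent side with task.artifacts[...].get_local_copy().
task.upload_artifact(name="config-dir", artifact_object="/path/to/local/config")

task.execute_remotely(queue_name="default", exit_process=True)

# ... on the remote side:
local_config = task.artifacts["config-dir"].get_local_copy()
```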