So from our IT guys I now know that the "s3" part of the URL is a subdomain; we use it in all other libs like boto3 and cloudpathlib and have never had any problems.
This is where the crash happens inside the ClearML Task.
Adding the bucket in clearml.conf causes the same error: clearml.storage - ERROR - Failed uploading: Could not connect to the endpoint URL: "None"
Specifying it like this gets me a different error:
Exception has occurred: ValueError
- Insufficient permissions (delete failed) for None
botocore.exceptions.ClientError: An error occurred (IllegalLocationConstraintException) when calling the DeleteObject operation: The me-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.
During handling of the above exception, another exception occurred:
File "/home/ma...
I can't get the conf credentials to work.
Specifying it like this gives me:
Exception has occurred: ValueError
Could not get access credentials for 'None', check configuration file ~/clearml.conf
OK, slight update. It seems like artifacts are now uploading to the bucket; maybe my file explorer was showing an old cache or something.
However, reported images are uploaded to the fileserver instead of S3.
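If I understand it correctly, debug/reported images follow the logger's upload destination rather than output_uri, so something like this should point them at the bucket (the URI is a placeholder, so take it as a sketch):

import clearml

task = clearml.Task.init(project_name="project", task_name="task")
# Send reported/debug images to the bucket instead of the default fileserver (placeholder URI).
task.get_logger().set_default_upload_destination("s3://our-host.com/bucket")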
Here is the script I'm using to test things. Thanks.
It looks like I'm moving forward.
Setting the URL in clearml.conf without "s3" as suggested works (but I don't add a port there; not sure if that breaks something, we don't have a port):
host: "our-host.com"
Then in test_task.py:

import clearml

task: clearml.Task = clearml.Task.init(
    project_name="project",
    task_name="task",
    output_uri=" None ",
)
I think the connection is created.
What I'm getting now is a bucket error; I suppose I have to specify it, so...
We don't need a port.
"s3" is part of the URL that is configured on our routers; without it we cannot connect.
I have tried:
Airflow - a pain to set up, old UI and other problems.
Prefect - I literally just tried to set up a simple distributed system and it took me a week; I do not recommend this tool at all: horrible documentation, and no one helps on Slack.
Dagster - an absolute beauty: nice UI, easy to set up (as a pip package or just Docker + Postgres). I highly recommend this tool; it takes a bit to get used to. In the coming week I will try this combo of Dagster + ClearML, where I periodically check some things and if... (rough sketch of what I mean below)
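A minimal sketch of that Dagster + ClearML combo, just to illustrate the idea (op/job names and the check are made up):

import clearml
from dagster import job, op


@op
def check_new_data() -> bool:
    # Placeholder for whatever periodic condition we want to monitor.
    return True


@op
def maybe_launch_training(should_run: bool) -> None:
    if not should_run:
        return
    # Kick off a ClearML task from inside the Dagster op (project/task names are placeholders).
    task = clearml.Task.init(project_name="project", task_name="scheduled-training")
    # ... actual training or enqueueing would go here ...
    task.close()


@job
def periodic_pipeline():
    maybe_launch_training(check_new_data())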
@<1523701070390366208:profile|CostlyOstrich36> I'm still unable to understand what I'm doing wrong.
We have a self-hosted S3 Ceph storage server.
Setting my config like this breaks Task.init.
Good morning, I tried the script you provided and I'm getting somewhere.
The ClearML team should really write up a tutorial about this; I see this question weekly now. The short answer on what we did when we migrated servers: we wrote a Python script that takes data from the ClearML MongoDB (stores tasks and datasets) and Elasticsearch (stores debug image URLs, logs, scalars) and migrates it to the other ClearML instance's databases.
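Very roughly, the script looked like the sketch below. Connection strings, the database name and the index pattern are placeholders from memory, so verify them against your own deployment (and note the elasticsearch client keyword arguments differ slightly between major versions):

from elasticsearch import Elasticsearch, helpers
from pymongo import MongoClient

# Placeholder connection strings; adjust to your deployment.
src_mongo = MongoClient("mongodb://old-server:27017")
dst_mongo = MongoClient("mongodb://new-server:27017")
src_es = Elasticsearch("http://old-server:9200")
dst_es = Elasticsearch("http://new-server:9200")

# Copy task/dataset documents from the old MongoDB to the new one.
for collection in ("task",):  # add whichever collections you need
    docs = list(src_mongo["backend"][collection].find())
    if docs:
        dst_mongo["backend"][collection].insert_many(docs)

# Re-index the events (logs, scalars, debug image URLs) from the old Elasticsearch.
for hit in helpers.scan(src_es, index="events-*"):
    dst_es.index(index=hit["_index"], id=hit["_id"], document=hit["_source"])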
I purged all Docker images and it still doesn't seem right.
I see no side panel and it doesn't ask for a login name.
Here are my ClearML versions, and Elasticsearch is taking up 50 GB.
From docker inspect I can see that allegroai/clearml uses:
"CLEARML_SERVER_VERSION=1.11.0",
"CLEARML_SERVER_BUILD=373"
Image hash: ed05631045c4237f59ad48f477e06dd72274ab67e70d2f9adc489431d1ce75d7
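(For reference, that output came from something along these lines; the tag is whatever you are actually running:)

docker image inspect allegroai/clearml:latest --format '{{json .Config.Env}}'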
I do notice another strange thing:
agent-services is down because it has no API key for ClearML.
@<1523701087100473344:profile|SuccessfulKoala55> Anything on this?
Elasticsearch also takes about 15 GB of RAM.
OK, is the dataset path stored in Mongo?
I'm unable to find it in Elasticsearch (the debug images were there).
OK, I found it.
Are S3 links supported?
@<1523701070390366208:profile|CostlyOstrich36> Is it still needed, since Eugene thinks there is a bug?
When I look at the LinkEntry object, the link property is correct, with no duplicates. It's the relative_path that's duplicated, and also the key name in _dataset_link_entries.
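This is roughly how I'm inspecting it (the dataset ID is a placeholder, and _dataset_link_entries is an internal attribute, so it may change between versions):

from clearml import Dataset

ds = Dataset.get(dataset_id="<dataset-id>")  # placeholder ID
# Each value is a LinkEntry; the issue above is about .relative_path and the dict key being duplicated.
for key, entry in ds._dataset_link_entries.items():
    print(key, entry.link, entry.relative_path)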