
We use a Ceph Storage Cluster; the interface to it is the same as S3.
I don't get what I have misconfigured.
The only thing I have not added is the "region" field in clearml.conf, because we literally don't have one; it's a self-hosted cluster.
You can try to replicate the S3 config I posted earlier.
Is it even known if the bug is fixed in that version?
WebApp: 1.14.1-451 • Server: 1.14.1-451 • API: 2.28
How can I do that?
I need to keep the original hash, otherwise I lose all traceability for about 2k experiments.
Yes, the credentials seem to work.
I'm now trying to figure out why I don't see the uploaded files/folders:
- I checked whether the ClearML task uses the fileserver instead, but I don't see any files in the fileserver folder
- Nothing is uploaded to the bucket (I will ask our IT guy to check the logs to see whether I'm uploading any files)
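One way to check the bucket directly, without going through ClearML, is to list it with boto3 against the same S3-compatible endpoint. This is only a sketch under assumptions: boto3 is installed, the endpoint URL and credentials are placeholders, and `list_keys` and `make_client` are hypothetical helper names, not part of ClearML.

```python
def make_client(endpoint_url, key, secret):
    """Create an S3 client for a self-hosted, S3-compatible endpoint."""
    # Imported lazily so list_keys() can be used and tested without boto3.
    import boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,  # placeholder, e.g. "https://our-host.com"
        aws_access_key_id=key,
        aws_secret_access_key=secret,
    )


def list_keys(client, bucket, prefix=""):
    """Return every object key under `prefix` in `bucket`.

    `client` only needs a boto3-style list_objects_v2() method.
    """
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects_v2(**kwargs)
        # "Contents" is absent when the listing is empty.
        keys.extend(obj["Key"] for obj in response.get("Contents", []))
        if not response.get("IsTruncated"):
            return keys
        # Results are paged; continue from where the last page stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

Calling `list_keys(make_client("https://our-host.com", "xxx", "xxx"), "my-bucket")` (all values made up) would then show whether anything reached the bucket at all.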
@<1523701070390366208:profile|CostlyOstrich36> Any news on this? We are currently stuck without this fix and can't finish our ClearML setup.
The problem is that the clearml.conf s3 config doesn't support an empty region field; even an empty string crashes it.
I purged all Docker images and it still doesn't seem right.
I see no side panel and it doesn't ask for a login name.
Also, when uploading artifacts, I can see where they are stored in the S3 bucket, but I can't find where the debug images are stored.
I tried it with a port, but I'm still having the same issue.
Tried it with/without secure and multipart.
We don't need a port.
"s3" is part of the URL that is configured on our routers; without it we cannot connect.
What you want is a service script that cleans up archived tasks. Here is what we used: None
It looks like I'm moving forward.
Setting the URL in clearml.conf without "s3", as suggested, works (but I don't add a port there; not sure if that breaks something, we don't have a port):
host: "our-host.com"
Then in test_task.py:
task: clearml.Task = clearml.Task.init(
    project_name="project",
    task_name="task",
    output_uri=" None ",
)
I think the connection is created.
What I'm getting now is a bucket error; I suppose I have to specify it somewhere...
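For reference, the form ClearML documents for non-AWS, S3-compatible storage puts the bucket in the URI itself: `s3://host:port/bucket/path`. Below is a small sketch that assembles such a string; `build_output_uri` is a hypothetical helper, the bucket name and prefix are placeholders, and whether the port can be left out on this particular setup is exactly the open question here, so treat that branch as an assumption.

```python
def build_output_uri(host, bucket, prefix="", port=None):
    """Assemble an s3://host[:port]/bucket[/prefix] string for output_uri."""
    if not bucket.strip("/"):
        raise ValueError("a bucket name is required")
    # Only append the port when one is given (assumption: it may be omitted).
    netloc = f"{host}:{port}" if port is not None else host
    parts = [bucket.strip("/")]
    if prefix.strip("/"):
        parts.append(prefix.strip("/"))
    return f"s3://{netloc}/" + "/".join(parts)


# Placeholder values, not the real cluster settings.
print(build_output_uri("our-host.com", "my-bucket"))
print(build_output_uri("our-host.com", "my-bucket", prefix="clearml/", port=443))
```

The resulting string is what would be passed as `output_uri` to `Task.init`, with the `host` entry in clearml.conf matching the host part.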
OK, then I have a solution, but it still produces duplicate names:
- new_dataset._dataset_link_entries = {}  # Clearing all raw/a.png files
- Resize a.png and save it in another location as a_resized.png
- Add back the other files I need (excluding raw/a.png) by adding them to new_dataset._dataset_link_entries
- Use add_external_files to include it in the dataset. I'm also using dataset_path=[a list of relative paths]
What I would expect:
100 files removed (all a.png)
100 files added (all a_resized.png)
...
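The expectation above amounts to a set difference between the file list of the parent dataset and that of the new version. A small sketch with made-up paths; `diff_file_lists` is a hypothetical helper and not part of the ClearML API:

```python
def diff_file_lists(old_paths, new_paths):
    """Return (removed, added) relative paths between two dataset versions."""
    old, new = set(old_paths), set(new_paths)
    return sorted(old - new), sorted(new - old)


# Made-up layout: 100 raw images replaced by 100 resized ones,
# plus one file that is carried over unchanged.
old = [f"raw/{i}/a.png" for i in range(100)] + ["labels.csv"]
new = [f"resized/{i}/a_resized.png" for i in range(100)] + ["labels.csv"]

removed, added = diff_file_lists(old, new)
print(len(removed), len(added))
```

Comparing this against what the dataset version actually reports can help show whether the duplicates come from the link entries or from the relative paths.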
Hey, I see that 1.14.2 dropped.
I tried it but the issue is still there; maybe the hotfix is in the next patch?
Here is the setup so you can reproduce it (we don't have a region field):
clearml.conf:
s3 {
    use_credentials_chain: false
    credentials: [
        {
            host: "s3.somehost.com"
            key: "XXXXXXXXXXXXXXXXXXXX"
            secret: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
I solved the problem.
I had to add a TensorBoard logger and pass it to the pytorch_lightning trainer with logger=logger.
Is that normal?
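For anyone hitting the same thing, this is roughly what that fix looks like. A minimal sketch, assuming pytorch_lightning and tensorboard are installed; `build_trainer`, the log directory, the run name, and `max_epochs=1` are illustrative choices, not values from the original setup.

```python
def build_trainer(log_dir="tb_logs", name="experiment"):
    """Create a Lightning Trainer that writes metrics through TensorBoard."""
    # Imported lazily so this module can be loaded without Lightning installed.
    import pytorch_lightning as pl
    from pytorch_lightning.loggers import TensorBoardLogger

    # Explicit TensorBoard logger; event files go to <log_dir>/<name>/version_N.
    logger = TensorBoardLogger(save_dir=log_dir, name=name)
    return pl.Trainer(logger=logger, max_epochs=1)
```

The trainer returned by `build_trainer()` is then used as usual with `trainer.fit(model, ...)`, after `Task.init` has been called.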
@<1523701601770934272:profile|GiganticMole91> Those are rookie numbers. We are at 228 GB for Elastic now.
7 out of 30 GB is currently used and is quite stable
But there are still some weird issues: I cannot see the uploaded files in the bucket.
@<1523701070390366208:profile|CostlyOstrich36> I updated the webserver and the problem still persists.
This is the new stack:
WebApp: 1.15.1-478 • Server: 1.14.1-451 • API: 2.28
Note that we didn't update the API (we had running experiments).
@<1523701435869433856:profile|SmugDolphin23> Any ideas how to fix this?
There is a typo in the clearml.conf I sent you, on line 87: it should be "key", not "ey". I'm aware of it.
py file:
task: clearml.Task = clearml.Task.init(
    project_name="project",
    task_name="task",
    output_uri=" None ",
)
clearml.conf:
{
    # This will apply to all buckets in this host (unless key/value is specifically provided for a given bucket)
    host: "our-host.com"
    key: "xxx"
    secret: "xxx"
    multipart: false
...
Hi, OK, I'm really close to a working system now.
Debug images are uploading to S3 and I can see the files, so all is OK there.
The problem now is viewing these images in the web UI.
Going to the Debug Samples panel in a Task brings up a popup asking me to fill in S3 credentials.
I can't figure out what the right setup is for the creds to work.
This is what I have now (note that we don't have a region):