Yes, that's exactly what I'm concerned about. Maybe that's the problem, because we're sharing the same ClearML for several projects and we log many things to the console, and ClearML captures all of it.
But you can see in the log that it manages to connect for a bit and then it's interrupted
Not sure, I would ask the DevOps team to check
You mean there are too many connections to the files server, so I can't open any more?
Looks like you're having some connectivity issues with the files server
2024-11-14 07:05:30,888 - clearml.storage - INFO - Uploading: 5.00MB / 12.82MB @ 35.88MBs from /tmp/state.vykhyxpt.json
2024-11-14 07:05:31,111 - clearml.storage - INFO - Uploading: 10.00MB / 12.82MB @ 22.36MBs from /tmp/state.vykhyxpt.json
1731567938707 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff76aaaf310>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff76aaaf5b0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff76aa8aa60>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff76aa8aca0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
1731567942714 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff763eb5460>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff763eb5490>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff763eb3310>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff763eb35e0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
1731567943959 labserver info Uploading dataset changes (30545 files compressed to 13.07 MiB) to
1731567944758 labserver info 2024-11-14 07:05:44,758 - clearml.storage - INFO - Uploading: 5.00MB / 8.33MB @ 44.04MBs from /tmp/state.gkewkbrp.json
2024-11-14 07:05:44,805 - clearml.storage - INFO - Uploading: 5.00MB / 13.07MB @ 28.98MBs from /tmp/dataset.4e49382b63ef4e4aa5189a3e19dc4f03.z5prqb9n.zip
2024-11-14 07:05:45,218 - clearml.storage - INFO - Uploading: 10.00MB / 13.07MB @ 12.10MBs from /tmp/dataset.4e49382b63ef4e4aa5189a3e19dc4f03.z5prqb9n.zip
File compression and upload completed: total size 13.07 MiB, 1 chunk(s) stored (average size 13.07 MiB)
1731567945994 labserver info 2024-11-14 07:05:45,994 - clearml.storage - INFO - Uploading: 5.00MB / 8.33MB @ 64.37MBs from /tmp/state.11sha1i0.json
1731567947091 labserver info 2024-11-14 07:05:47,090 - clearml.storage - INFO - Uploading: 5.00MB / 8.33MB @ 65.74MBs from /tmp/state.fzbvb2c2.json
Updating statistics and genealogy
1731567950720 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff77167c700>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff77167c9d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff77166a970>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff77166a460>: Failed to establish a new connection: [Errno 111] Connection refused')': /
1731567953731 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff77264a460>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764fe7070>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764fe7520>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764fe7fd0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
1731567957737 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764fe7940>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764fe7ee0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764ff8460>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=1, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764ff8a90>: Failed to establish a new connection: [Errno 111] Connection refused')': /
1731567965748 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764ff8640>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764ff8e50>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff764ff8550>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=0, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff76b0dc3d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
1731567968761 labserver error WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff762e756d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff7dffc8dc0>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff7e6df0250>: Failed to establish a new connection: [Errno 111] Connection refused')': /
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff7dfe99610>: Failed to establish a new connection: [Errno 111] Connection refused')': /
I'd suggest checking that
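One quick way to check this from the machine that runs the upload is a plain TCP probe. This is only a minimal sketch: the `can_connect` helper and the `http://localhost:8081` address are illustrative assumptions (8081 is the default ClearML files server port, but your deployment may differ), so substitute the `files_server` value from your own `clearml.conf`.

```python
import socket
from urllib.parse import urlparse


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (Errno 111), timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical address; replace with your own files server URL.
    files_server = "http://localhost:8081"
    print(files_server, "reachable:", can_connect(files_server))
```

If this returns False intermittently while the upload is running, that points at the server or the network path rather than at the dataset code.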
@<1523701070390366208:profile|CostlyOstrich36> Thanks for the quick response.
Are you trying to upload data to the files server?
Yes, I'm trying to upload to the files server using
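The code I'm using:
from clearml import Dataset

# task_tag and ng_path are defined elsewhere
ds = Dataset.create(
    dataset_name="xxxx",
    dataset_project="A/B/C",
    dataset_tags=[task_tag],
)
ds.add_files(path=ng_path)  # register the local files with the dataset
ds.upload()  # upload the compressed files to storage
ds.finalize(verbose=True)  # close this dataset version
The full log is attached.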
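While the connectivity problem is being investigated, a generic retry wrapper around the upload step is one possible stopgap. This is a sketch under stated assumptions, not part of ClearML: the `retry` helper is hypothetical, and which exception type a failed upload actually raises is not shown in the log, so the `exceptions` argument has to be adjusted to whatever you observe.

```python
import time


def retry(fn, attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Call fn() until it succeeds or attempts run out, doubling the delay each time."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

With the snippet above it could be used as `retry(ds.upload, attempts=5)`. Retrying only hides short interruptions; if the files server keeps refusing connections, the underlying cause still needs to be fixed.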
Hi @<1638712150060961792:profile|SilkyCrocodile89> , it looks like a connectivity issue. Are you trying to upload data to the files server? Can you share the full log?