Hi everyone, weird problem with Dataset.get_local_copy (both from SDK and from clearml-data): I have a dataset with a single file and lots of S3 links. Used to work perfectly until those files started becoming larger (or it is just a matter of bad timing)

Hi everyone, weird problem with Dataset.get_local_copy (both from the SDK and from clearml-data):

I have a dataset with a single file and lots of S3 links. It used to work perfectly until those files started getting larger (or it may just be a matter of bad timing). Currently, I consistently get download failures for some of my files, with:

  • err: Connection was closed before we received a valid response from endpoint URL
  • err: SSL validation failed for --[my file]-- [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)

Now, why do I say it is weird? Because the same file can be downloaded with StorageManager.get_local_copy.
Posted 4 months ago
@<1523701435869433856:profile|SmugDolphin23> I only set max_workers=1 and it seems to work. Thanks!

Posted 4 months ago

Okay, I was prematurely happy. Will update soon.

Posted 4 months ago

Will do and report back! Thanks.

Posted 4 months ago

Hi @<1523705721235968000:profile|GrittyStarfish67>! This looks like a boto3 error. You could try lowering sdk.aws.s3.boto3.max_multipart_concurrency in clearml.conf and setting max_workers=1 when calling Dataset.get_local_copy.
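
A minimal sketch of the max_workers part of that suggestion, assuming a ClearML version whose Dataset.get_local_copy accepts max_workers. The helper name get_dataset_copy_serially and the dataset ID placeholder are hypothetical, made up for illustration. The concurrency setting lives in clearml.conf rather than in code, so it is not shown here; the key path is as quoted above and worth checking against your own clearml.conf template.

```python
def get_dataset_copy_serially(dataset_id: str, max_workers: int = 1) -> str:
    """Fetch a local copy of a ClearML dataset with limited download parallelism.

    Hypothetical helper: the name and the default of 1 are illustrative only.
    """
    # Imported lazily so this module loads even where clearml is not installed.
    from clearml import Dataset

    dataset = Dataset.get(dataset_id=dataset_id)
    # With max_workers=1 the files and external links are fetched one at a time,
    # which keeps the number of simultaneous S3 connections low.
    return dataset.get_local_copy(max_workers=max_workers)


# Example call (placeholder ID, replace with a real one):
# local_path = get_dataset_copy_serially("<your-dataset-id>")
```

This trades download speed for fewer concurrent connections, so it may be noticeably slower on datasets with many links.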

Posted 4 months ago
