What happens if you use the settings I pasted?
Hi @<1544853695869489152:profile|NonchalantOx99> , can you please add the full log?
Hi @<1582904448076746752:profile|TightGorilla98> , you would need to update these addresses in MongoDB/Elasticsearch to point to the new address. A migration script would do the job
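To give an idea of what such a migration script boils down to, here is a minimal sketch of the link-rewriting step only. It works on plain Python dicts rather than on a live database, because the actual collection and field names are not covered in this thread; `rewrite_links`, the sample record and both host addresses are hypothetical names made up for illustration.

```python
from typing import Any


def rewrite_links(value: Any, old_prefix: str, new_prefix: str) -> Any:
    """Return a copy of `value` where every string starting with
    `old_prefix` is re-pointed to `new_prefix`. Other values are kept."""
    if isinstance(value, str):
        if value.startswith(old_prefix):
            return new_prefix + value[len(old_prefix):]
        return value
    if isinstance(value, dict):
        return {k: rewrite_links(v, old_prefix, new_prefix) for k, v in value.items()}
    if isinstance(value, list):
        return [rewrite_links(v, old_prefix, new_prefix) for v in value]
    return value


# Hypothetical record: the field names are illustrative, not the real schema
doc = {
    "name": "model-a",
    "uri": "http://old-host:8081/proj/model.pkl",
    "samples": ["http://old-host:8081/proj/img.png", "unrelated"],
    "size": 3,
}
migrated = rewrite_links(doc, "http://old-host:8081", "http://new-host:8081")
```

In a real migration you would read each stored document, pass it through a function like this, and write it back; take a backup of the databases before trying anything of the sort.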
Also, is it an AWS S3 or is it some similar storage solution like Minio?
Oh, can you please do the same with developer tools when user tries to accept?
AbruptWorm50 , please provide a log of the task 🙂
Hi, ZippyWalrus56 , can you add a full print of your console log?
Also if possible provide a code snippet, that can help understand the problem 🙂
Hi @<1750327622178443264:profile|CleanOwl48> , you need to set output_uri in Task.init(): set it to True to upload to the files server, or to a storage URL string if you want to use S3, for example
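A minimal sketch of the two options, assuming the `clearml` package is installed and configured. `pick_output_uri`, `start_task` and the bucket URL in the usage note are hypothetical names used only for illustration.

```python
from typing import Optional, Union


def pick_output_uri(bucket_url: Optional[str] = None) -> Union[bool, str]:
    """True means "upload to the files server";
    a string means "upload to this storage URL"."""
    return bucket_url if bucket_url else True


def start_task(project: str, name: str, bucket_url: Optional[str] = None):
    # Imported lazily so the helper above works without clearml installed
    from clearml import Task

    return Task.init(
        project_name=project,
        task_name=name,
        output_uri=pick_output_uri(bucket_url),
    )
```

For example, `start_task("examples", "upload demo")` would target the files server, while `start_task("examples", "upload demo", "s3://my-bucket/models")` would target the (made-up) bucket.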
Hi NaughtyFish36 ,
"Execute a local training run (from within a docker container) which registers a task on our clearml server"
When you do this, does ClearML detect the docker image that you're running on?
Initially there was the issue of no access via ssh, but that seemed to be fixed through mounting the local .ssh directory onto the docker container root. The subsequent error is the one above i.e. the reference is not a tree. However I can happily checkout that commit hash myself, yet t...
DeliciousBluewhale87 , Hi 🙂
You mean you created a dataset task on a certain server and you want to move that dataset task to another server?
Hi @<1590514584836378624:profile|AmiableSeaturtle81> , can you please add the log? What version of clearml are you using?
Hi NuttyCamel41 , what kind of additional information are you looking to report? What is your use case?
You shouldn't lose credentials. How exactly are you deploying your server? All of the server's data should be saved in one of the /opt/ folders, as explained in the installation steps
Yeah, but how are iterations marked in the script?
Please follow the instructions.
Hi @<1523701304709353472:profile|OddShrimp85> , I assume you're running on top of K8s?
Looks like it is the issue indeed. You need a docker image that already has ssh or to install ssh in the image via the shell init script
@<1561885921379356672:profile|GorgeousPuppy74> , what docs did you find to be lacking?
VirtuousFish83 Hi 🙂
What versions are you running with (ClearML, ClearML-Agent, Torch, Lightning)? Which OS are they running on, and with what Python version?
Do you maybe have a snippet to play around with to try and reproduce the issue?
I'm afraid it is not part of the open-source version. The PRO plan is pretty cheap (15 USD/month per user + some usages for applications & storage) compared to the price of the compute you're paying for on GCP. In the long run it would be saving money on idle machines and the time of your DevOps that need to raise/lower these machines all the time 🙂
Hi @<1780043419314294784:profile|LargeHamster21> , are you running multiple instances of the agent on the same machine? If that is the case, can you elaborate on the use case?
I think it tries to get the latest one. Are you using the agent in docker mode? You can also control this via clearml.conf with agent.cuda_version
Hi @<1739455989154844672:profile|SmarmyHamster62> , Are you sure about the version of ClearML? Can you share the entire log of the triton container?
@<1644147961996775424:profile|HurtStarfish47> , you also have the auto_connect_frameworks parameter of Task.init to disable the automatic logging. You can then log manually using the Model module to name and register the model yourself (and upload it, of course)
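A rough sketch of that flow, assuming the `clearml` package is installed and that `OutputModel` is the class meant by "the Model module". The `"pytorch"` key, the project and task names, and the `register_model` helper are illustrative assumptions, not something taken from your setup.

```python
# Disable automatic model logging for one framework only.
# "pytorch" is an assumed example; use the framework you actually train with.
AUTO_CONNECT = {"pytorch": False}


def register_model(weights_path: str, model_name: str,
                   project: str = "examples",
                   task_name: str = "manual model logging"):
    # Imported lazily so this file can be loaded without clearml installed
    from clearml import Task, OutputModel

    task = Task.init(
        project_name=project,
        task_name=task_name,
        auto_connect_frameworks=AUTO_CONNECT,
    )
    # Register the model under a name of your choosing
    model = OutputModel(task=task, name=model_name, framework="PyTorch")
    # Attach (and upload) the weights file; keep the local copy afterwards
    model.update_weights(weights_filename=weights_path, auto_delete_file=False)
    return model
```

Calling `register_model("model.pt", "my-named-model")` after saving the weights would then register them under that name instead of the auto-generated one.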
Great, thanks! Looking into it 🙂
Hi @<1558986867771183104:profile|ShakyKangaroo32> , you can do it but keep in mind that models/artifacts/debug samples are all referenced as links inside mongo/ES, you'd have to migrate the databases for that
Oh I see, I think TimelyPenguin76 worked with something similar.
I'll take a large snippet too 😛
Do you have any idea what's the source of this?
TypeError: __init__() got an unexpected keyword argument 'configurations'
Hi @<1838387863251587072:profile|JealousCrocodile85> , are you referring to one of the examples? What steps did you take?