DeterminedCrab71 that is a good point, how does plotly adjust for nans on graphs?
I would suggest deleting them immediately when they're no longer needed,
This is the idea for the next RC, it will delete them after it is done using them 🙂
DilapidatedDucks58 You might be able to, check the links, they might be embedded into the docker, so you can map a different png file from the host 🙂
BTW: what would you change the icons to?
Let me know if I understand you correctly, the main goal is to control the model serving, and deploy to your K8s cluster, is that correct ?
Do you have a roadmap which includes resolving things like this?
Security, SSO etc. is usually out of scope for the open-source platform, as it really makes the entire thing a lot harder to install and manage. That said, I know that on the Enterprise solution they do have SSO and LDAP support, and probably way more security features. I hope it helps 🙂
Assuming it was hashed, the seed would be stored on the same server, so knowing both would allow me the same access, no?
If that's the case you have two options:
- Create a Dataset from local/NFS and upload it to the S3 compatible NetApp storage (notice this creates an immutable copy of the data)
- Create a Dataset and add "external links", either to the S3 storage, e.g. s3://<host>:<port>/<bucket>/..., or a direct file link, file:///mnt/nfs/path. Notice that in this example the system does not manage the data, which means that if someone deletes/moves the data you are unaware of that. (Both options are sketched right after this list.) And of course you can...
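A minimal sketch of both options, assuming the clearml SDK with hypothetical dataset names, paths, and storage endpoint:
`
from clearml import Dataset

# Option 1: immutable copy, uploaded to the S3 compatible storage
ds = Dataset.create(dataset_name="my-data", dataset_project="datasets")  # hypothetical names
ds.add_files("/mnt/nfs/path")
ds.upload(output_url="s3://<host>:<port>/<bucket>")  # placeholder endpoint
ds.finalize()

# Option 2: external links only, the data stays in place and is not managed
ds = Dataset.create(dataset_name="my-data-links", dataset_project="datasets")
ds.add_external_files(source_url="file:///mnt/nfs/path")
ds.finalize()
`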
- Set hashed passwords with pass_hashed: true
- Generate passwords using python3 -c 'import bcrypt,base64; print(base64.b64encode(bcrypt.hashpw("password".encode(), bcrypt.gensalt())))' (obviously, replace "password" with the actual password). The resulting b64 string should be placed in the password field for each user.
For example, assuming your password is "123456", in bash:
> python3 -c 'import bcrypt,base64; print(base64.b64encode(bcrypt.hashpw("123456".encode(), bcrypt.gensalt())))'
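For completeness, a quick Python sketch of the same hashing plus a round-trip check (standalone, using the same bcrypt/base64 calls as the one-liner above):
`
import base64
import bcrypt

password = "123456"  # replace with the actual password
hashed = base64.b64encode(bcrypt.hashpw(password.encode(), bcrypt.gensalt()))
print(hashed.decode())  # this b64 string goes into the user's password field

# sanity check: the b64 string decodes back to a hash that matches the password
assert bcrypt.checkpw(password.encode(), base64.b64decode(hashed))
`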
Plan is to have it out in the next couple of weeks.
Together with a major update in v0.16
SuperiorPanda77 I have to admit, not sure what would cause the slowness only on GCP ... (if anything I would expect the network infrastructure to be faster)
Hi GrievingTurkey78
Could you provide some more details on your use case, and what's expected?
Hi PompousParrot44
So do you mean something like:
` task_model_a = Task.get_task(task_id='id_a')
task_model_b = Task.get_task(task_id='id_b')
model_a_file = task_model_a.models['output'][-1].get_local_copy()
model_b_file = task_model_b.models['output'][-1].get_local_copy() `
NICE! MoodyCentipede68 this is awesome 🙂
RC is out, SmugSnake6 please try with pip install clearml==1.7.2rc1
Hi JitteryCoyote63
What do you have in the agent.cuda_version ?
(you can see it printed at the beginning of the log)
The easiest is export_task / update_task:
https://allegro.ai/docs/task.html#trains.task.Task.export_task
https://allegro.ai/docs/task.html#trains.task.Task.update_task
Check the structure returned by export_task, you'll find the entire Task configuration there,
then, you can use that to update back the Task.
BTW:
Partial update is also supported...
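A rough sketch of that flow (assuming the trains SDK from the links above; the task id is a placeholder):
`
from trains import Task

task = Task.get_task(task_id='<task-id>')   # placeholder id
data = task.export_task()                   # the entire Task configuration as a dict
data['name'] = data['name'] + ' (edited)'   # change whatever fields you need
task.update_task(data)                      # a partial dict works as well
`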
Hi ItchyJellyfish73
This seems aligned with the scenario you are describing; it seems the api server is overloaded with simultaneous connections.
Add an additional apiserver instance to the docker-compose and an nginx as a load balancer:
https://github.com/allegroai/clearml-server/blob/09ab2af34cbf9a38f317e15d17454a2eb4c7efd0/docker/docker-compose.yml#L4
`
apiserver:
  command:
  - apiserver
  container_name: clearml-apiserver
  image: allegroai/clearml:latest
  restart: unless-stopped
  ...
`
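A minimal nginx load-balancer sketch in front of two apiserver containers (the second service name and the 8008 port are assumptions based on the default compose file):
`
upstream clearml_api {
    server apiserver:8008;     # the original apiserver service
    server apiserver-2:8008;   # the additional instance you add
}

server {
    listen 8008;
    location / {
        proxy_pass http://clearml_api;
    }
}
`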
Hi @<1538330703932952576:profile|ThickSeaurchin47>
Specifically I'm getting the error "could not access credentials"
Put your minio credentials here:
None
BattyLion34 is this consistent?
(Really I can't see any difference; one time it is able to create the venv and another it is failing with a permission error)
Yey!
My pleasure 🙂
Is this like a local minio?
What do you have under the sdk/aws/s3 section ?
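For reference, a sketch of that section in clearml.conf pointing at a local minio (host and keys are placeholders):
`
sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "localhost:9000"    # placeholder minio endpoint
                    key: "minio-access-key"   # placeholder
                    secret: "minio-secret"    # placeholder
                    multipart: false
                    secure: false             # set true if minio is served over https
                }
            ]
        }
    }
}
`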
Hmm that should have worked ...
I'm assuming the Task itself is running on a remote agent, correct ?
Can you see the changes in the OmegaConf section ?
What happens when you pass --args overrides="['dataset.path=abcd']" ?
That is quite neat! You can also put a soft link from the main repo to the submodule for better visibility
hmm that is odd, it should have detected it, can you verify the issue still exists with the latest RC? pip3 install clearml-agent==1.2.4rc3
CrookedWalrus33
Force SSH git authentication, it will auto mount the .ssh from the host to the docker
https://github.com/allegroai/clearml-agent/blob/6c5087e425bcc9911c78751e2a6ae3e1c0640180/docs/clearml.conf#L25
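In clearml.conf that is a single agent setting (this is what the linked line toggles):
`
agent {
    # force git over SSH, the agent mounts ~/.ssh from the host into the docker
    force_git_ssh_protocol: true
}
`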
Hi SmallDeer34
Generally, any torch.save(...) call is logged/uploaded by clearml automatically. Specifically in your case I think the only missing one is trainer_state.json, which I assume is a plain json file and, I imagine, part of the huggingface framework. You can easily upload it as an additional artifact with Task.upload_artifact, wdyt?
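A quick sketch of that call (the file path is a placeholder):
`
from clearml import Task

task = Task.current_task()  # or Task.get_task(task_id='...')
# attach the json file to the task as an extra artifact
task.upload_artifact(name='trainer_state', artifact_object='output/trainer_state.json')
`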
Is it only for modified changes and not untracked files?
Basically everything that "git diff" will output.
Then the agent will re-apply it on a remote machine
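Conceptually the mechanism is equivalent to something like this (a sketch, not the agent's literal commands):
`
# on the dev machine, when the task is created:
git diff HEAD > uncommitted.patch   # modified tracked files only, untracked files are not included

# on the agent, after cloning the repo at the recorded commit:
git apply uncommitted.patch
`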