Yes, there was a bug where it was always cached, just upgrade clearml:
pip install git+
You mean the job with the exact same arguments ?
do you have other arguments you are passing ?
Are you using Optuna / BOHB ?
Hi BoredPigeon26
what do you mean by "reuse the task" ? is this manual execution (i.e. from code)?
How about archiving the old version?
You can also force Task.init to always create a new Task (which preserves the previous run alongside the new execution)
Basically what's the specific use case ?
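Something like this minimal sketch (the project/task names here are just placeholders; reuse_last_task_id is the relevant Task.init argument):
```python
from clearml import Task

# reuse_last_task_id=False forces a brand-new Task on every run,
# so the previous run is kept instead of being overwritten.
task = Task.init(
    project_name="my_project",   # placeholder names
    task_name="my_experiment",
    reuse_last_task_id=False,
)
```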
About .get_local_copy... would that then work in the agent though?
Yes it would work both locally (i.e. without agent) and remotely
Because I understand that there might not be a local copy in the Agent?
If the file does not exist locally it will be downloaded and cached for you
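For example, a quick sketch (the task id and artifact name are placeholders):
```python
from clearml import Task

task = Task.get_task(task_id="<task_id>")  # placeholder id
# Returns a local file path; if the artifact isn't on this machine yet
# it is downloaded and cached first, so the same code works under an agent.
local_path = task.artifacts["my_artifact"].get_local_copy()
```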
And if you could also update the docs with all the env vars possible to set up, it would be awesome!
Yes, I'll pass it on, that is a good point
Thanks! Yes, this could be great !
Could you please open a GitHub issue, so we remember to update the feature ?
BTW: if you want to sync between artifacts / settings, I would recommend calling task.reload() to get the latest values back from the server.
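Quick sketch (task id is a placeholder):
```python
from clearml import Task

task = Task.get_task(task_id="<task_id>")  # placeholder id
task.reload()                    # pull the latest state from the server
params = task.get_parameters()   # now reflects server-side values
```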
GreasyPenguin14 makes total sense.
In that case I would say variants of the accuracy make sense to me. I would suggest: `title='trains', series='accuracy/day'` and `title='trains', series='accuracy/night'`
Regarding hierarchy: from the implementation perspective, a unique identifier is always the combination of title/series (or in other words metric/variant); introducing another level is a system-wide change.
This means it might be more challenging than expected ...
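For reference, a rough sketch of the suggested title/series reporting (the values and iteration are made up):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="accuracy report")  # placeholder names
logger = task.get_logger()

# Same title, two series: both curves end up on the same plot in the UI.
logger.report_scalar(title="trains", series="accuracy/day", value=0.91, iteration=1)
logger.report_scalar(title="trains", series="accuracy/night", value=0.87, iteration=1)
```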
I am trying to use the configuration vault option but it doesn't seem to apply the variables I am using.
Hi EmbarrassedSpider34 I think this is an enterprise feature...
Managed to make the credentials attached to the configuration when the task is spun up,
I'm assuming env variables ?
Hi @<1523702868694011904:profile|AbruptCow41>
Check what you are getting when running `git status` inside the working directory; this is essentially how it works. Are you expecting to later run it with an agent?
Are you aware of any other way then (other than the `secure: false` flag)?
Actually self-signing and providing a certificate file is already supported by boto (and thus clearml)
AWS_CA_BUNDLE
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
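For example, a minimal sketch of pointing boto3 at a self-signed CA bundle before clearml touches S3 (the path is hypothetical):
```python
import os

# Point boto3 (and therefore clearml's S3 storage driver) at a
# self-signed CA bundle; must be set before any S3 access happens.
os.environ["AWS_CA_BUNDLE"] = "/path/to/ca-bundle.pem"  # hypothetical path
```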
LudicrousParrot69 we are working on adding nested projects, which should help with the humongous mess HPO can create. This is a more generic solution for the nesting issue (since nesting inside a table is probably not the best UX solution 🙂)
Yes, the webserver doesn't know where the api server is, it will access /api and then the nginx running the webapp will do the routing (reverse proxy)
I think that for some reason it is failing to do that (actually similar to the stackoverflow issue you linked)
Actually, no. This is to spin up the clearml-server on GCP, not the agent
I am using importlib and this is probably why everything's weird.
Yes that will explain a lot 🙂
No worries, glad to hear it worked out
Hi @<1547390438648844288:profile|ScaryJellyfish75>
These hyperparameters are now in the "Args" section of my ClearML task
Sure, that would probably mean:
UniformParameterRange(
"Args/training/optimizer/lr",
min_value=0.00025,
max_value=0.01,
step_size=0.00025,
),
assuming your Task has training/optimizer/lr in its Args section (under the Configuration tab), make sense?
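If it helps, a rough sketch of plugging that range into the optimizer (the base task id and the objective metric names are assumptions):
```python
from clearml.automation import (
    HyperParameterOptimizer,
    RandomSearch,
    UniformParameterRange,
)

optimizer = HyperParameterOptimizer(
    base_task_id="<template_task_id>",      # hypothetical template task
    hyper_parameters=[
        UniformParameterRange(
            "Args/training/optimizer/lr",
            min_value=0.00025,
            max_value=0.01,
            step_size=0.00025,
        ),
    ],
    objective_metric_title="validation",    # assumed metric title/series
    objective_metric_series="accuracy",
    objective_metric_sign="max",
    optimizer_class=RandomSearch,
)
optimizer.start_locally()
```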
CleanWhale17 per your request :)
- An automated ML Pipeline ✓
- Automated Data Source Integration ✓
- Data Pooling and Web Interface for Manual Annotation of Images (Seg. / Classif.) [Allegro Enterprise] or users integrate with open-source
- Storage of Annotation output files (versioned JSON) ✓
- Online-Training Support (for Dataset Shifts) [Not sure what you mean]
- Data Pre-processing (filter/augment) [Allegro Enterprise] or users integrate with open-source
- Data-set visualization (stats...
ChubbyLouse32 and this works when running the Python code directly, but not when the agent is running it?
On the same machine ?
Yes, let's assume we have a task with id aabbcc
On two different machines you can do the following:
trains-agent execute --docker --id aabbcc
This means you manually spin up two simultaneous copies of the same experiment. Once they are up and running, will your code be able to make the connection between them? (i.e. OpenMPI / torch.distributed etc.?)
But I do not have anything linked correctly since I rely on conda installing cuda/cudnn for me
From the log it installed:
cudatoolkit==11.1.1
based on the CUDA it found on the host machine: agent.cuda_version = 110
But for some reason it installed pytorch from the conda "pytorch" channel without cuda support.
you should see your agent there
Hi UnsightlyShark53, apologies for this delayed reply; Slack doesn't alert users unless you add @, so things sometimes get lost :(
I think you pointed at the correct culprit...
Did you manage to overcome the circular include?
BTW, how could I reproduce it? It would be nice if we could solve it
HandsomeCrow5
client.events.debug_images(metrics=[dict(task='6adb929f66d14731bc76e3493ab89d80', metric='image')])
CourageousLizard33 Are you using the docker-compose to set up the trains-server?
Hi ScantChimpanzee51
btw: this seems like an S3 internal error
https://github.com/boto/s3transfer/issues/197