Hi ScaryJellyfish75
These hyperparameters are now in the "Args" section of my ClearML task
Sure, that would probably mean:
UniformParameterRange(
"Args/training/optimizer/lr",
min_value=0.00025,
max_value=0.01,
step_size=0.00025,
),
assuming your Task has training/optimizer/lr in its Args section (under the Configuration tab). Make sense?
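For a rough sense of how big that search space is, here is a minimal sketch that lists the learning rates such a range would cover. It assumes values are taken at min_value + k * step_size, up to and including max_value; `grid_values` is a hypothetical helper for illustration, not a ClearML function.

```python
import math


def grid_values(min_value, max_value, step_size):
    """List min_value + k * step_size for every k that does not exceed max_value.

    Assumption: the upper bound is included when it falls on the grid.
    Hypothetical helper for illustration only, not part of ClearML.
    """
    # Small tolerance so float noise (e.g. 38.9999999) does not drop the last value.
    steps = math.floor((max_value - min_value) / step_size + 1e-9)
    # Round each value to hide float artifacts in the printed numbers.
    return [round(min_value + k * step_size, 10) for k in range(steps + 1)]


lr_values = grid_values(0.00025, 0.01, 0.00025)
print(len(lr_values), lr_values[0], lr_values[-1])
```

With the numbers from the snippet above this gives 40 candidate values, which is worth keeping in mind when you set the job limit of the optimizer.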
VexedCat68 yes :) you can also pass the parent folder and it will zip all of its subfolders into a single artifact
Notice both need to be str
btw, if you need the entire folder just use StorageManager.upload_folder
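As a minimal sketch of the folder-as-artifact option, something like the helper below should do. `upload_folder_artifact` is a hypothetical name; the only ClearML call it relies on is `task.upload_artifact(name=..., artifact_object=...)`, and `task` is assumed to be a ClearML Task (for example `Task.current_task()`).

```python
from pathlib import Path


def upload_folder_artifact(task, folder, name=None):
    """Register a local folder as a single artifact on `task`.

    `task` is assumed to be a clearml Task; any object exposing
    upload_artifact(name=..., artifact_object=...) works here.
    Hypothetical helper, not part of ClearML.
    """
    folder = Path(folder)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a folder")
    # Pass the name and the folder path as plain strings; the folder
    # (including subfolders) is then packaged as one artifact.
    return task.upload_artifact(name=name or folder.name, artifact_object=str(folder))
```

Called as upload_folder_artifact(Task.current_task(), "outputs"), this would register the "outputs" folder under the artifact name "outputs" (the folder name here is only an example).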
Hi MagnificentSeaurchin90
Any chance you can provide more info on the error?
if I want to compare two experiments the scalar plots do not load ( loading forever ).
I'm assuming the issue is the Plots tab? or is it the Scalars? what do you have in the Plots? can you send an image of the single experiment ?
I'm not sure the files-server supports "continue" from last position...
is it a shared network mount ? could you just delete the entire ~/.clearml on the host machine ?
CleanWhale17 per your request :)
An automated ML Pipeline :)
Automated Data Source Integration :)
Data Pooling and Web Interface for Manual Annotation of Images (Seg. / Classif) [Allegro Enterprise] or users integrate with open-source
Storage of Annotation output files (versioned JSON) :)
Online-Training Support (for Dataset Shifts) [Not Sure what you mean]
Data Pre-processing (filter/augment) [Allegro Enterprise] or users integrate with open-source
Data-set visualization(stats...
ChubbyLouse32 and this works when running python code and not when the agent is running ?
On the same machine ?
Yes, let's assume we have a task with id aabbcc
On two different machines you can do the following: `trains-agent execute --docker --id aabbcc`
This means you manually spin up two simultaneous copies of the same experiment. Once they are up and running, will your code be able to make the connection between them (i.e. OpenMPI, torch distributed, etc.)?
Hi CrabbyKoala94
I wanted to use method Task.reset() or Task.delete() however none of that seems to be able to delete only the logs in the "console" section in the UI.
So Task.reset will reset the entire outputs of the Task (and the status), as you noticed. Why would you want to just remove the logs?
You can disable the auto logs altogether if you really want to, see Task.init [auto_connect_streams](https://github.com/allegroai/cl...
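A minimal sketch of what that could look like, assuming `auto_connect_streams` accepts either a bool or a dict with the keys `stdout`, `stderr` and `logging`. The helper names and the project/task names you pass in are placeholders, not anything from your setup.

```python
def stream_logging_options(capture_console=False):
    """Build a value for the auto_connect_streams argument of Task.init.

    Assumption: the argument accepts a dict with the keys
    'stdout', 'stderr' and 'logging'. Hypothetical helper.
    """
    return {
        "stdout": capture_console,
        "stderr": capture_console,
        "logging": capture_console,
    }


def init_task_without_console_logs(project_name, task_name):
    """Create a Task that does not capture console output."""
    # Imported lazily so the helper above can be used without clearml installed.
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name,
        auto_connect_streams=stream_logging_options(False),
    )
```

Passing auto_connect_streams=False directly should have the same effect; the dict form is only useful if you want to keep, say, stderr but drop stdout.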
Hi JitteryCoyote63
Wait a few hours, there is a new fix, I'll make sure we upload it later today (scheduled to be there anyhow, I'll push it forward)
I can see all the steps like git clone,
git clone has nothing to do with "env setup"; it brings in the code, so you cannot skip it. That said, this is why the git repository itself is cached on the host machine, so it is fast
... There may be some odd package that needs to be installed because one of our DS is experimenting ... But all that we can see what is happening.
even if everything is preinstalled, it verifies that the packages match, and this might take a long time. It's just pip being ...
WackyRabbit7
regular trains-agent modus operandi is one job at a time (i.e. until the Task is done, no other Tasks will be pulled from the queue).
When adding --services-mode, it is Not 1-1 but 1-N, meaning a single trains-agent will launch as many Tasks as it can.
The trains-agent pulls a job from the queue and spins a docker (only dockers are supported for the time being) and lets the job run in the background (the job itself will be registered as another "worker" in the system). Then the...
And your ~/clearml.conf ?
But I do not have anything linked correctly since I rely on conda installing cuda/cudnn for me
From the log it installed: cudatoolkit==11.1.1
based on the CUDA it found on the host machine: agent.cuda_version = 110
But for some reason it installed the pytorch from the conda "pytorch" repo without the cuda support.
If the same Task is run with different parameters...
ShinyWhale52 sorry, I kind of missed that in the explanation
The pipeline will always* create a new copy (clone) of the original Task (step), then modify the step's inputs etc.
The idea is that you have the experiment management (read execution management) to create full transparency into the pipelines and steps. Think of it as the missing part in a lot of pipeline platforms where after you executed the pipeline you need to furthe...
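To make the clone-then-modify idea concrete, here is a minimal sketch of doing the same thing by hand. This is not how the pipeline controller is implemented, just an illustration; `clone_step` is a hypothetical helper, `task_cls` is assumed to be `clearml.Task`, and the parameter keys are assumed to use the "Section/name" form.

```python
def clone_step(task_cls, template_task_id, overrides, queue_name, name=None):
    """Clone a template Task, override parameters on the copy, and enqueue it.

    `task_cls` is assumed to be clearml.Task (it is passed in so the
    function does not import clearml itself). Hypothetical helper.
    """
    # The original Task is left untouched; all changes go to the copy.
    cloned = task_cls.clone(source_task=template_task_id, name=name)
    for key, value in overrides.items():
        # Keys are assumed to look like "Args/lr".
        cloned.set_parameter(key, value)
    # Send the modified copy to an execution queue for an agent to pick up.
    task_cls.enqueue(cloned, queue_name=queue_name)
    return cloned
```

Usage would be along the lines of clone_step(Task, "aabbcc", {"Args/lr": 0.001}, "default"), where the task id, parameter name and queue name are all placeholders.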
Hi PanickyMoth78
` torch.save(net.state_dict(), PATH)  # auto-uploads to GCS
# get all the models from the Task
output_models = Task.current_task().models["output"]
# get the last one
last_model = output_models[-1]
# set meta-data
last_model.set_metadata(key="my key", value="my value", v_type="str") `
Hi UpsetWalrus59
just wondering - shouldn't the job still work if I didn't push the commit yet
How would that work? it does not know which commit to take? it would also fail on git diff apply, no?
you should see your agent there
instead of the one that I want or the one of the env which it is started from.
The default is the python that is used to run the agent.
` agent.ignore_requested_python_version = true
agent.python_binary = /my/selected/python3.8 `
It completed after the max_job limit (10)
Yep this is optuna "testing the water"
Hi UnsightlyShark53 apologies for this delayed reply, slack doesn't alert users unless you add @ , so things sometimes get lost :(
I think you pointed at the correct culprit...
Did you manage to overcome the circular include?
BTW, how could I reproduce it? It would be nice if we could solve it
The package detection is done when running the code on your laptop, and this is when it first logs the packages and versions. Following that, what do you have on your laptop? OS/Conda/Python
What do you have in "server_info['url']" ?
Hmm, in the credentials popup there should be a "secure connect" checkbox, it tells it to use https instead of http. Can you verify?
restart the notebook kernel ?
Hi FancyWhale93 you can disable the auto model uploading with
` @PipelineDecorator.component(..., auto_connect_frameworks={'pytorch': False})
def step():
    pass `
ClearML does not work easily with Google Drive.
Yes, Google Drive is not Google Storage (which ClearML supports :) )
Seems like you solved it?