Hi PompousBeetle71
Try this one, let me know if it helped:
logging.getLogger('trains.frameworks').setLevel(logging.ERROR)
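For reference, a minimal self-contained version of that snippet, assuming the noisy messages come from the 'trains.frameworks' logger named above (the child logger name is hypothetical, only there to show that the level is inherited):

```python
import logging

# Only messages at ERROR or above from this logger get through.
framework_logger = logging.getLogger('trains.frameworks')
framework_logger.setLevel(logging.ERROR)

# Hypothetical child logger: it has no level of its own, so it inherits
# the effective level from 'trains.frameworks'.
child_logger = logging.getLogger('trains.frameworks.example')
inherited = child_logger.getEffectiveLevel() == logging.ERROR
```

This only changes what is emitted by that logger tree, other loggers keep their own levels.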
Hi FuzzySeaanemone21
and then run "clearml-agent daemon --gpus 0 --queue gcp-l4" to start the worker.
I'm assuming the docker service cannot spin up a container with GPU access, usually this means you are missing the nvidia docker runtime component
Or can I enable agent in this kind of local mode?
You just built a local agent
NastyFox63 ask SuccessfulKoala55 tomorrow, I think there is a way to change the default settings even with the current version.
(I.e. increase the default 100 entries limit)
What are you seeing in the Task that was cloned (i.e. the one the HPO created not the original training task)?
by that I mean, configuration section, do you have the Args there ? (seems like the pic you attached, but I just want to make sure)
Also in the train.py file, do you also have Task.init ?
Hi EnviousStarfish54
You mean the console output ? If that's the case, the Task.init call will monkey patch sys.stdout/sys.stderr to report to clearml as well as the console
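To illustrate the idea only (this is a sketch of stream patching in general, not the actual clearml implementation; TeeStream and the StringIO sink are made-up stand-ins): replace sys.stdout with a wrapper that forwards every write to the real console and to a second destination.

```python
import io
import sys


class TeeStream:
    """Forward every write to the original stream and to a second sink."""

    def __init__(self, original, sink):
        self._original = original
        self._sink = sink

    def write(self, text):
        self._sink.write(text)
        return self._original.write(text)

    def flush(self):
        self._original.flush()
        self._sink.flush()


captured = io.StringIO()          # stand-in for the remote reporter
original_stdout = sys.stdout
sys.stdout = TeeStream(original_stdout, captured)
try:
    print("epoch 1 done")         # shows on the console and lands in `captured`
finally:
    sys.stdout = original_stdout  # always restore the real stream
```

The same wrapper would work for sys.stderr. The point is that the console keeps working as before while a copy of the output goes elsewhere.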
let me check a sec
SmallAnt76
see https://clear.ml/pricing/ , under "What plan should I choose?"
what you are looking for is the first column "open-source". make sense ?
TenseOstrich47 make sense :)
RipeGoose2 yes, the UI cannot embed the html yet, but if you click on the link itself it will open the html in a new tab.
Could you verify it works ?
BTW: how is it missing torch in the listing ? Do you have "import torch" in the code ?
Is it only for modified changes and not untracked files?
basically everything that "git diff" will output.
Then the agent will re-apply it on a remote machine
So what you are saying is the workers randomly report on one another's experiments ?
Hi MiniatureRobin9 could it be the pipeline logic is created via the clearml-task CLI? If this is the case, I think this is an edge case we should fix. Basically it creates a Task instead of a pipeline, which in essence only affects the UI. To solve it, just run the pipeline locally, notice that by default when you start it, it will actually stop the local run and relaunch itself on an agent.
Also, could you open a GitHub issue so we add a flag for it?
What's the general pattern for running a pipeline - train model, evaluate metrics and publish the model if satisfactory (based on a threshold, for example)
Basically I would do:
parameters for pipeline:
TaskA = Training model Task (think of it as our template Task)
Metric = title/series/sign we want to choose based on, where sign is max/min
Project = Project to compare the performance so that we could decide to publish based on the best Metric.
Pipeline:
Clone TaskA
Change TaskA argu...
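The "publish only if it beats the best Metric in the Project" decision can be sketched on its own, without any clearml calls. The function names below are hypothetical, and fetching the metric values from the Tasks in the Project is left out; this only shows how the max/min sign would drive the comparison.

```python
from typing import Iterable


def is_better(candidate: float, reference: float, sign: str) -> bool:
    """Compare two metric values; sign is 'max' or 'min'."""
    if sign == "max":
        return candidate > reference
    if sign == "min":
        return candidate < reference
    raise ValueError(f"sign must be 'max' or 'min', got {sign!r}")


def should_publish(candidate: float, project_scores: Iterable[float], sign: str) -> bool:
    """Publish only if the candidate beats every previous score in the project."""
    return all(is_better(candidate, score, sign) for score in project_scores)


# Accuracy-style metric (higher is better) vs. loss-style metric (lower is better)
publish_accuracy = should_publish(0.93, [0.90, 0.91], "max")
publish_loss = should_publish(0.35, [0.30, 0.40], "min")
```

Note that with an empty Project (no previous scores) this sketch always says publish. A fixed threshold could be handled the same way by passing it as the only reference score.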
This works.
great!
So it is still in master and should be included in 1.0.5?
correct, RC will be released soon with this fix included
Which clearml version are you using ?
That is a bit odd, but SSH keys need specific chmod permissions for them to work (for security reasons)
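As a rough illustration of the permissions point: OpenSSH refuses private keys that are accessible to group or others, so owner-only access (0o600) is the usual setting. Whether that was the actual problem here is an assumption. The sketch uses a throwaway temp file instead of a real key, and assumes a POSIX system.

```python
import os
import stat
import tempfile

# Throwaway file standing in for a private key such as ~/.ssh/id_rsa
with tempfile.NamedTemporaryFile(delete=False) as handle:
    key_path = handle.name

os.chmod(key_path, 0o600)  # owner read/write only
mode = stat.S_IMODE(os.stat(key_path).st_mode)

# True would mean group/others still have some access to the file
group_or_other_access = bool(mode & (stat.S_IRWXG | stat.S_IRWXO))

os.remove(key_path)  # clean up the stand-in file
```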
What was the error ?
And after having called Task.init() the second time, the automatic logging of resources and tensorboard plots works as well. I would recommend adding explanation to the docs for
Oh yeah! you always need to call Task.init first. Task.current_task can be called from anywhere you like, but only after Task.init was called.
JitteryCoyote63 the new wizard was pushed, you can check it out here:
https://github.com/allegroai/trains/blob/master/examples/services/aws-autoscaler/aws_autoscaler.py
BTW: next release to include it all is next week (hopefully :))
Hi SkinnyPanda43
I realized that the params are not being saved anymore
Could you test with clearml==1.0.4 ?
force_git_ssh_protocol means it will always authenticate with SSH
...
But it seems you need mixed behavior ?
Are you using github as git provider ?
Ohh, then yes, you can use the https://github.com/allegroai/clearml/blob/bd110aed5e902efbc03fd4f0e576e40c860e0fb2/clearml/automation/monitor.py#L10 class to monitor changes in the dataset/project
Could you test with the latest "clearml": pip install git+
Task.add_requirements(".") should be supported now :)
MagnificentSeaurchin79
"requirements.txt" is ignored if the Task has an "installed packages" section (i.e. not completely empty).
Task.add_requirements('pandas') needs to be called before Task.init() (I'll make sure there is a warning if called after)