
MuddyRobin9 are you sure it was able to spin up the EC2 instance? Which ClearML autoscaler version are you running?
GiganticTurtle0 is it just --stop that throws this error?
BTW: if you add --queue default
to the command line I assume it will work. The thing is, without --queue it will look for any queue with the "default" tag on it, and since there are none, we get the error.
Regardless, that should not happen with --stop
I will make sure we fix it
Just so we do not forget, can you please open an issue on the clearml-agent GitHub?
BTW: the cloning error is actually the wrong branch. If you take a look at your initial screenshot, you can see on the line before last: branch='default'
which I assume should be branch='master'
(The error itself is still weird, but I assume this is what git is returning)
Yep, automatically moving a tag
No, but you can get the last created/updated one with that tag (so I guess the same?)
meant like the best artifacts.
So artifacts can be retrieved like a dict:
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts_retrieval.py
Task.get_task(project_name='examples', task_name='artifacts example').artifacts['name']
This is odd, it says 1.0.0 but then it was updated two weeks ago ...
It's a running number because PyTorch Lightning is creating the same TensorBoard file for every run
Hi UnevenBee3
the optuna study is stored on the optuna class
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/clearml/automation/optuna/optuna.py#L186
And actually you could store and restore it
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/clearml/automation/optuna/optuna.py#L104
I think we should improve the interface though, maybe also add get_study(), wdyt?
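The store/restore idea can be sketched with plain pickle. Note this is an illustrative sketch, not the ClearML API: SimpleNamespace stands in for a real optuna Study object so the snippet is self-contained, and all names here are made up.

```python
import pickle
from types import SimpleNamespace

# Stand-in for an optuna Study object (illustrative only).
study = SimpleNamespace(best_value=0.42, trials=[1, 2, 3])

blob = pickle.dumps(study)      # "store" (e.g. persist or upload somewhere)
restored = pickle.loads(blob)   # "restore" in a later process

print(restored.best_value)
```

A get_study() style accessor would just hand back the live study object, so you could also continue the optimization instead of only inspecting it.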
- This then looks for a module called foo, even though it's just a namespace
I think this is the issue, are you using Python namespace packages?
(This is a PEP feature that is rarely used, and I have seen it break too many times)
Assuming you have from foo.mod import
what are you seeing in pip freeze? I'd like to see if we can fix this, and better support namespaces
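For context, a self-contained sketch of what a PEP 420 namespace package looks like: two unrelated directories both contribute to the same foo package without any __init__.py. The directory and module names here are made up for illustration.

```python
import sys
import tempfile
from pathlib import Path

# Two unrelated directories each contribute a "foo" sub-tree,
# with no __init__.py anywhere (PEP 420 implicit namespace package).
root = Path(tempfile.mkdtemp())
for branch, mod in [("a", "mod"), ("b", "other")]:
    pkg = root / branch / "foo"
    pkg.mkdir(parents=True)
    (pkg / (mod + ".py")).write_text("VALUE = '%s'\n" % mod)
    sys.path.insert(0, str(root / branch))

import foo.mod    # found under root/a/foo
import foo.other  # found under root/b/foo

# "foo" is a namespace: its __path__ spans both directories and it has
# no single source file of its own, which is what trips up tooling that
# expects every package to map to one installed distribution.
print(list(foo.__path__))
print(foo.mod.VALUE, foo.other.VALUE)
```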
Also, I just wanted to say thanks for the tool! I'm managing a small data science practice and it's going to be really nice to have a view of all of the experiments we've got and know our GPU utilization, all without having to give every data scientist access to each box where the workflows are run. Incredibly stoked.
♥ ❤ ♥
Hi SuperiorCockroach75
ModuleNotFoundError: No module named 'statsmodels'
seems like this package is missing from the Task
Either import it manually: import statsmodels
(so the automagic logs it)
Or add before task init:
Task.add_requirements("statsmodels")
task = Task.init(...)
ps: no need to @ so many people ...
Hi PerplexedGoat65
it appears, in a practical sense, this means to mount the second drive, and then bind them in ClearML's configuration
Yes, the entire data folder (reason is, if you lose it, you lose all the server storage / artifacts)
Also, thinking about Docker and slower access speed for Docker mounts and such,
If the host OS is linux, you have nothing to worry about, speed will be the same.
(also I'm a bit new to this world, what's wrong with OpenShift?)
It's the most difficult Kubernetes flavor to work with 🙂
We've already tried that but it didn't really change ...
Can you provide full log? as well as how you created the pods ?
Sure thing! This feature is all you guys, ask and you shall receive 🙂
git config --system credential.helper 'store --file /root/.git-credentials'
Maybe we should use this hack for cloning with user/token in general ...
You actually have to login/ssh under said user, have another dedicated mountpoint and spin the agent from that user.
Is there any known issue with Amazon SageMaker and ClearML?
On the contrary, it actually works better on SageMaker...
Here is what I did on SageMaker:
- created a new SageMaker instance
- opened a Jupyter notebook
- started a new notebook (conda_python3 / conda_py3_pytorch)
Then I just did "!pip install clearml" and Task.init
Is there any difference ?
LOL EnormousWorm79 you should have a "do not show again" option, no?
Thanks HelpfulHare30, I would love to know what you find out, please feel free to share 🙂
If the right properties are set can the profile tab be added?
I guess that is doable, that said some of the graphs are not straightforward to support, like this one:
https://www.tensorflow.org/guide/images/tf_profiler/trace_viewer.png
Hi BoredHedgehog47
Just make sure it is installed as part of the "installed packages" 🙂
You should end up with something like: git+
You can actually add it from your code:
Task.add_requirements("git+
")
task = Task.init(...)
Notice you can also add a specific commit or branch:
git+ https://github.com/user/repo.git@<commit_id_here_if_needed>
Is this what you are looking for ?
EDIT:
you can also do "-e ." that should also work:
Task.add_requirements("-e .")
task = Ta...
What we would like ideally, is a system where development, training, and deployment are almost one and the same thing, to reduce the lead time from development code to production models.
This is very aligned with the goals of ClearML 🙂
I would like to understand more about what is currently missing in ClearML so we can better support this approach
my inexperience in using them a lot until recently. I can see how that is a better solution
I think I failed in explaining my self, I me...
Sounds good.
BTW, when the clearml-agent is set to use "conda" as package manager it will automatically install the correct cudatoolkit in any new venv it creates. The cudatoolkit version is picked directly when "developing" the code, assuming you have conda installed as the development environment (basically you can transparently do end-to-end conda, and not worry about CUDA at all)
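For reference, a sketch of the relevant clearml.conf fragment that switches the agent's package manager; treat the exact layout as an assumption and check it against your own clearml.conf:

```
# clearml.conf (sketch; verify against your actual file)
agent {
    package_manager {
        # use conda instead of pip so cudatoolkit is resolved per-venv
        type: conda
    }
}
```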
So when the agent fires up it gets the hostname, which you can then get from the API,
I think it does something like "getlocalhost", a Python function that is OS agnostic
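For reference, the OS-agnostic way to get the local hostname from Python's standard library; a sketch, the agent's exact internal call may differ:

```python
import socket

# Resolve the machine's hostname; behaves the same on Linux/macOS/Windows.
hostname = socket.gethostname()
print(hostname)
```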
or even different task types
Yes there are:
https://clear.ml/docs/latest/docs/fundamentals/task#task-types
https://github.com/allegroai/clearml/blob/b3176a223b192fdedb78713dbe34ea60ccbf6dfa/clearml/backend_interface/task/task.py#L81
Right now I don't see differences, is this a deliberate design?
You mean on how to use them? I.e. best practice ?
https://clear.ml/docs/latest/docs/fundamentals/task#task-states
I find it quite difficult to explain these ideas succinctly, did I make any sense to you?
Yep, I think we are totally on the same wavelength 🙂
However, it also seems to be not too prescriptive,
One last question, what do you mean by that?