Hi SoreDragonfly16
Sadly no, the idea is to give full visibility to all users in the system (basically, share everything with your colleagues).
That said, I know the enterprise version has permission / security features; I'm sure it covers this scenario as well.
Do people use ClearML with huggingface transformers? The code is standard transformers code.
I believe they do 🙂
There is no real way to differentiate between "storing a model" using torch.save
and storing configuration ...
GrievingTurkey78
maybe since the package is not directly imported in my code it is possible to get a different version from what I have locally (?).
If these are derivative packages (i.e. imported by other packages), they are not automatically logged when executing the Task manually (in order to keep the "installed packages" as lean as possible on the one hand, while still specifying the packages that matter to you on the other)
That said, when the trains-agent executes the task it will store back...
Hi FunnyTurkey96
Which pip version are you using? Basically pip changed its dependency resolver after 20.1
Change pip_version here: https://github.com/allegroai/clearml-agent/blob/aede6f4bac71c8fc56e7cf982318a48527953a3c/docs/clearml.conf#L57
pip_version: "<20.2"
See if that helps
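A minimal sketch of how that could look in your clearml.conf (key location under agent.package_manager assumed from the linked example config, adjust if your layout differs):
agent {
    package_manager {
        # pin pip below 20.2 so the agent keeps the old dependency resolver
        pip_version: "<20.2"
    }
}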
Basically it gives the container direct access to the host, which is why it is considered less safe (it has access on other levels as well, like the network)
SmarmySeaurchin8 regarding the original question:
task.set_project(project_id)
Task.get_projects() to get all the project names/ids
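For example, a quick sketch (project and task names here are just placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="my task")  # placeholder names
projects = Task.get_projects()  # all projects (names / ids)
project_id = [p.id for p in projects if p.name == "target_project"][0]  # placeholder project name
task.set_project(project_id)  # move the task to that project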
SmugOx94 Yes, we just introduced it 🙂 with 0.16.3
Discussion was here (I'll make sure to update the issue that the version is out)
https://github.com/allegroai/trains/issues/222
In your trains.conf
add the following line:
sdk.development.store_code_diff_from_remote = true
It will store the diff from the remote HEAD instead of the local one.
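If you prefer the nested form, a sketch of how it might look in trains.conf (same sdk.development section, just written HOCON-style):
sdk {
    development {
        # store the git diff against the remote HEAD instead of the local commit
        store_code_diff_from_remote: true
    }
}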
without the ClearML Server in-between.
You mean the upload/download is slow? What is the reasoning behind removing the ClearML Server?
ClearML Agent per step
You can use the ClearML agent to build a docker per Task, so all you need is just to run the docker. Will that help?
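Something along these lines (task id and image name are placeholders, and the exact flags may differ between clearml-agent versions, so check clearml-agent build --help):
# build a standalone docker image for a single Task, then just docker run it
clearml-agent build --id <task_id> --docker --target my_task_image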
Maybe WackyRabbit7's approach is a better one, as you will get a new object (instead of the runtime copy that is being used)
Hi CurvedDolphin95
I would first check the free space on the instance (it might be that git is reporting an inaccurate error, and it's actually free space, not permissions, causing the clone to fail).
I would also check your GitHub account; notice that they now only support user/api-key (and not user/pass), which means you need to create an api-key and add it as your password in the clearml.conf.
Any chance that for some reason some of the Tasks are running as a different user? Or not using a docker?
Hi JitteryCoyote63
Do you have a specific example in mind ?
Does a pipeline step behave differently?
Are you disabling it in the pipeline step ?
(disabling it for the pipeline Task has no effect on the pipeline steps themselves)
Quick update, I found the issue, working on a fix 🙂
trains was not able to pick the right wheel when I updated the torch req from 1.3.1 to 1.7.0: it downloaded the wheel for CUDA 10.1.
Could you send a log? It should have worked 😞
JitteryCoyote63
I am setting up a new machine with two rtx 3070 GPU
Nice! you are one of the lucky few who managed to buy them 🙂
Which makes me think that the wrong torch package is installed
I think that torch 1.3.1 does not support CUDA 11 😞
I'm kind of at a point where I don't know a lot of what to even search for.
we feel you 💗, yes, there still isn't a very good source of information on where to get started...
This is because the entire field is constantly changing and evolving, and one solution will usually only apply to a specific use case...
I would start with the MLOps community Slack channel and YouTube talks (specifically those where companies describe how they built their own internal infrastructure, i...
Yes 🙂 https://discuss.pytorch.org/t/shm-error-in-docker/22755
add either "--ipc=host" or "--shm-size=8g" to the docker args (on the Task or globally in the clearml.conf extra_docker_args)
notice the 8g depends on the GPU
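A sketch of the global option in clearml.conf (the exact key name, e.g. agent.extra_docker_arguments, may vary between agent versions, so check your example clearml.conf):
agent {
    # extra arguments appended to every docker run the agent launches (key name assumed)
    extra_docker_arguments: ["--ipc=host"]
}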
It seems to follow a structure specific to clearml,
Actually plotly.js 🙂
Hi GrievingTurkey78
How are you getting a different version than what is used at runtime? It analyzes the PYTHONPATH just as python does. How can I reproduce it?
Currently you can use Task.add_requirements(package_name, package_version=None)
This will not force it though, it is a recommendation (used if it fails to find the package itself). Maybe we can add a force option?! What do you think?
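For example (package name and version are placeholders, and the call goes before Task.init):
from clearml import Task

# record a package in "Installed packages" even though it is never imported directly
Task.add_requirements("some_package", package_version="1.2.3")  # placeholder name/version
task = Task.init(project_name="examples", task_name="requirements demo")  # placeholder names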
Hmm, good question. I'm actually not sure you can pass 24GB (this is not a limit on the GPU memory; it affects the memblock size, I think)
mostly out of curiosity, what is the motivation behind introducing this as an environment variable knob rather than a flag with some default in Task.init?
DepressedChimpanzee34 we will deprecate the demo server (not exactly sure when) as we have the free community one that gives better service and stores the data. It was originally set for easy on-boarding and testing, but I think that now the user experience might be better with using the community free tier.
Make sense ? btw: what ...
WackyRabbit7 basically starting with v1.1, if you are running code without any configuration file you will get an error (in contrast to previous versions, where it defaulted to the demo-server)
I assume so 🙂 Datasets are kind of agnostic to the data itself, for the Dataset it's basically a file hierarchy
Hi JitteryCoyote63
Just making sure, the package itself is installed as part of the "Installed packages", and it also installs a command line utility?
Regarding the project name:
set_project will support project_name in the next version 🙂
In the meantime:
project_id = [p.id for p in Task.get_projects() if p.name == project_name][0]