My code is in classes, indeed. But I have more than one model. Actually, all the things that people store in, for example, YAML or JSON configs, I store in Python files. And I do not want to statically import all the models/configs.
but of course, this is all largely dependent on your code and structure etc
I have a strange theory that, if the code is in classes, you could include both in one .py file and then set `ENV["use_model"]="a"` or `ENV["use_model"]="b"` to select between them... that way, you would clone the experiment, change the config and re-run.
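Something like this, maybe (just a sketch; the model classes and the `use_model` variable name are made up):
```
# one .py file containing both models (hypothetical classes)
import os

class ModelA:
    def train(self):
        print("training model A")

class ModelB:
    def train(self):
        print("training model B")

# pick the model from an environment variable, e.g.  use_model=b python train.py
model_cls = {"a": ModelA, "b": ModelB}[os.environ.get("use_model", "a")]
model_cls().train()
```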
Stupid question Tim (and I understand that maybe your code is under NDA etc.), but can you show the Python code that you need to A/B against?
Never a problem Tim... although it does prompt me to try and figure out A/B model testing myself... I see everything as a "potential blog post" 🙂 🙂
Thanks a lot. To summarize: To me clearml is a framework, but I would rather have it be a library.
Other than that I am very happy with clearml and it is probably my favorite machine-learning-related package of the last two years! 🙂 And thanks for taking so much time to talk to me!
Yes, the mechanisms under the hood are quite complex; the automagic does not come for "free" 🙂
Anyhow, your perspective is understood. And as you mentioned, I think your use case might be a bit less common. Nonetheless we will try to come up with a solution (probably an argument for `Task.init` so you could specify a few more options for the auto package detection).
> But `Task.create` is used by `Task.init`…
Surprisingly, no 🙂
Btw: I think `Task.init` is more confusing than `Task.create`, and I would rather rename the former.
AgitatedDove14 Yes, you understood correctly. But `Task.create` is used by `Task.init`, something like this, right?
```
def init(project_name, task_name):
    if not Task.exists_already(project_name, task_name):
        task = Task.create(...)
    else:
        task = load_existing_task()
    return task
```
AlertBlackbird30 Thanks for asking. Just take everything I say with a grain of salt, because I am also not sure whether I do machine learning the correct way 🙂
I think you got the right idea. I actually do reinforcement learning (RL), so I have multiple RL environments and RL agents. However, while the code for the agents differs, the glue code is the same. So what I do is call `python run_experiment.py --agent myproject.agents.my_agent --environment myprojects.environments.my_environment` or `python run_experiment.py --agent myproject.agents.different_agent --environment myprojects.environments.also_different_env`.
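Roughly, the glue code looks like this (simplified sketch; the `get_agent` / `get_environment` factory functions are placeholder names):
```
# run_experiment.py (simplified sketch)
import argparse
import importlib

parser = argparse.ArgumentParser()
parser.add_argument("--agent", required=True)        # e.g. myproject.agents.my_agent
parser.add_argument("--environment", required=True)  # e.g. myprojects.environments.my_environment
args = parser.parse_args()

# the concrete modules are only known at runtime,
# which is why static code analysis cannot see them
agent_module = importlib.import_module(args.agent)
env_module = importlib.import_module(args.environment)

# get_agent / get_environment are placeholder factory names
agent = agent_module.get_agent()
env = env_module.get_environment()
# ... shared glue code: training loop, logging, etc.
```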
It seems like the naming of `Task.create` causes a lot of confusion (we are always open to suggestions and improvements). ReassuredTiger98, from your suggestion it sounds like you would actually like more control in `Task.init` (let's leave `Task.create` aside, as its main function is Not to log the current running code, but to create an auxiliary Task). Did I understand you correctly?
Obviously, in my examples there is a lot of stuff missing. I just want to show that the user should be able to replicate `Task.init` easily, so it can be configured in every way, but can still make use of the magic that clearml has for the stuff that does not differ from the comfort way.
Howdy Tim, I have tried to stay out of this, because a lot is going over my head (I am not a smart man 🙂), but one thing I wanted to ask: are you doing the swapping in and out of code to do A/B testing with your models?! Is this the reason for doing this? Because if so, I would be vastly more inclined to try and think of a good way to do that. Again, I may be wrong, I am just trying to understand the use case for swapping code in and out. 🙂
I think such an option can work, but actually, if I had free wishes, I would say that the clearml.Task code needs some refactoring (but I am not an experienced software engineer, so I could be totally wrong). It is not clear what `Task.init` does and how it does it, and the very long method declaration is confusing. I think there should be two ways to initialize tasks:
1. Specify a lot manually, e.g.:
```
task = Task.create()
task.add_requirements(from_requirements_files(...))
task.add_entrypoint(...)
...
```
2. A comfort way, just like it is now, but that internally looks like this:
```
def init(project_name, task_name):
    # only necessary stuff
    task = Task.create()
    task.add_requirements(use_clearml_conf_method(...))
    task.add_entrypoint(auto_determine_entrypoint)
    ...
```
> But you can manually add them with `Task.add_requirements`, no?
In my opinion an ugly solution. I would have to keep track of which requirements are missing. Then I would rather just add all requirements manually.
There is a git issue for selecting "pip freeze" / auto analyze, we could add "use requirements.txt"
wdyt?
> For now I come to the conclusion, that keeping a `requirements.txt` and making clearml parse…
Maybe we could just have that as another option?
> if I use automatic code analysis it will not find all packages because of `importlib`.
But you can manually add them with `Task.add_requirements`, no?
I am still trying to solve the `add_requirements` + `importlib` combo. If I use `detect_with_freeze` I cannot use `add_requirements`, and if I use automatic code analysis it will not find all packages because of `importlib`.
For now I come to the conclusion that keeping a `requirements.txt` and making clearml parse the requirements from there should be the most robust solution. Unfortunately, there seems to be no way to do this with `Task.init`.
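The best workaround I can think of would be something like this (rough sketch; it only handles `pkg==version` pins, and `Task.add_requirements` has to be called before `Task.init`):
```
from clearml import Task

# rough workaround: feed requirements.txt to clearml by hand;
# Task.add_requirements must be called BEFORE Task.init
with open("requirements.txt") as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        # only handles "pkg==version" pins; other specifiers would need real parsing
        name, _, version = line.partition("==")
        Task.add_requirements(name, version or None)

task = Task.init(project_name="my_project", task_name="my_task")  # names are placeholders
```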
> I can then programmatically choose which file to import with `importlib`. Is there a way to tell clearml programmatically to analyze the files, so it can build up the requirements correctly?
Sadly no 🙂
It analyzes the running code, then if it decides it is not a self-contained script it will analyze the entire repo...
> I just saw that `Task.create` takes…
`Task.create` is Not `Task.init`. It is meant to allow you to create new Tasks (think Jobs) from your own code. A good example would be pipelines and automation, where you want to create Tasks from a known codebase and send them for execution.
BTW: any reason why you would not want to analyze the entire repo?
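For example, something along these lines (rough sketch; the repo URL, script, and queue name are made-up examples):
```
from clearml import Task

# sketch: create a Task (job) from a known codebase and send it for execution
# (repo URL, branch, script and queue name are made-up examples)
task = Task.create(
    project_name="my_project",
    task_name="train_from_repo",
    repo="https://github.com/me/my_project.git",
    branch="main",
    script="train.py",
)
Task.enqueue(task, queue_name="default")  # an agent listening on "default" picks it up
```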
Or alternatively, I just saw that `Task.create` takes a `requirements.txt` as an argument. This would also be fine for me, however I am not sure whether I should use `Task.create`?
One last question, then I have everything solved: Is it possible to pass clearml the files to analyze manually? For example, my setup consists of a `run_this.py` and a `this_should_be_run_A.py` and a `this_should_be_run_B.py`. I can then programmatically choose which file to import with `importlib`. Is there a way to tell clearml programmatically to analyze the files, so it can build up the requirements correctly?
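The selection itself is basically this (simplified sketch; the `run()` entry point stands in for whatever the files actually expose):
```
# run_this.py (simplified sketch)
import importlib.util
import sys

variant = sys.argv[1] if len(sys.argv) > 1 else "A"
path = f"this_should_be_run_{variant}.py"

spec = importlib.util.spec_from_file_location("selected_module", path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)  # static analysis cannot see this import
module.run()  # run() is a placeholder entry point
```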
Here is a nice hack for you:
```
Task.add_requirements(
    package_name='carla',
    package_version="> 0 ; python_version < '2.7'  # this hack disables the pip install"
)
```
This will essentially make sure the agent will skip the installation of the package, but at least you will know it is there.
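In context, that would be registered before `Task.init` (project/task names below are placeholders):
```
from clearml import Task

# the requirements hack has to be registered before Task.init is called
Task.add_requirements(
    package_name='carla',
    package_version="> 0 ; python_version < '2.7'  # this hack disables the pip install"
)
task = Task.init(project_name="my_project", task_name="carla_experiment")  # placeholder names
```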
However, because of the `import carla` it is added to the task requirements, and clearml-agent tries to install it, although it is meant to be included at runtime.
I have a `carla.egg` file on my local machine and on the worker, which I include with `sys.path.append` before I can do `import carla`. It is the same procedure on my local machine and on the clearml-agent worker.
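i.e. roughly this at the top of the script (the egg path is a placeholder):
```
import sys

# make the pre-built egg importable before "import carla" runs
sys.path.append("/path/to/carla.egg")  # placeholder path

import carla  # resolved from the egg, not installed via pip
```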
Isn't that risky? Not knowing you need a package?
How do you actually install it on the remote machine with the agent?