Answered
Hi Guys, How Does Allegro Keep Track Of The Requirements (I'M Running The Scripts On A Remote Train-Agent With

Hi guys,
How does Allegro keep track of the requirements (I'm running the scripts on a remote trains-agent with --docker)?
Is there a way to make it automatically install requirements.txt?

  
  
Posted 3 years ago

Answers 30


Nice, I didn't know that 🙂

  
  
Posted 3 years ago

SmugOx94

after having installed 

numpy==1.16

 in the first case or 

numpy==1.19

 in the second case. Is it correct?

Correct

the reason is simply that I'd like to setup an MLOps system where

I see the rationale here (obviously one would have to maintain their requirements.txt)
The current way trains-agent works is that if there is a list of "installed packages" it will use it, and if it is empty it will default to the requirements.txt
We could have a flag (in trains.conf) saying: ignore the "installed packages" and just use whatever you have in the requirements.txt.
What do you think?
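
The selection rule described in this reply can be sketched in plain Python. This is only a model of the behaviour as stated in the thread, not the agent's actual code: the function name is made up, and the `ignore_installed_packages` parameter is a hypothetical stand-in for the configuration flag being discussed.

```python
from typing import List, Optional


def select_requirements(
    installed_packages: Optional[List[str]],
    requirements_txt: Optional[List[str]],
    ignore_installed_packages: bool = False,  # hypothetical flag, not a real setting
) -> List[str]:
    """Return the package list an agent would install, per the rule in the thread."""
    # Discussed option: skip the Task's "installed packages" and use requirements.txt only.
    if ignore_installed_packages:
        return list(requirements_txt or [])
    # Behaviour described in the thread: a non-empty "installed packages" list wins.
    if installed_packages:
        return list(installed_packages)
    # An empty list falls back to the repository's requirements.txt.
    return list(requirements_txt or [])
```

With `installed_packages=["numpy==1.16"]` the sketch returns that pinned list; with an empty list it falls back to whatever `requirements_txt` contains.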

  
  
Posted 3 years ago

Because at the moment I'm having a problem with the s3fs package where I have it in my requirements.txt but the import manager at the entry point doesn't install it

  
  
Posted 3 years ago

After the agent finished installing the "requirements.txt" it will put back the entire "pip freeze" into the "installed packages", this means that later we will be able to fully reproduce the working environment, even if packages change (which will eventually happen as we cannot expect everyone to constantly freeze versions)

This would be perfect

  
  
Posted 3 years ago

So I can set output_uri = "s3://<bucket_name>/prefix" and the local models will be loaded into the s3 bucket by ClearML ?

Yes, magic 🙂

  
  
Posted 3 years ago

While if I just download the right packages from the requirements.txt then I don't need to think about that

I see your point. The only question is, how come these packages are not automatically detected?

  
  
Posted 3 years ago

No sorry maybe I wasn't clear, let me clarify
Suppose I have set up a Trains server and a Trains agent (which uses docker to enforce reproducibility)

Consider I have a script script.py
` from trains import Task
import numpy as np

task = Task.init(project_name="my project", task_name="my task")
task.execute_remotely()

print(np.any_function(...)) `

UserA has a Python environment with numpy==1.16 and launches the script with python script.py
UserB has a Python environment with numpy==1.19 and launches the script with python script.py

If I understood correctly the script.py will be run on the remote train agent (in a docker container)
after having installed numpy==1.16 in the first case or numpy==1.19 in the second case. Is it correct?

What I think would be worthwhile is the ability to pin the requirements of a project (for example through a requirements.txt)
and then ensure that when python script.py is executed on that Trains agent, only requirements.txt will be used
(the reason is simply that I'd like to set up an MLOps system where the output of the experiments does not depend upon the local environment used to run the experiments, as is the case for script.py)

  
  
Posted 3 years ago

Does it work if I launch the clearml-agent on a docker and pip doesn't know the packages to install?

  
  
Posted 3 years ago

if in the "installed packages" I have all the packages installed from the requirements.txt then I guess I can clone it and use "installed packages"

  
  
Posted 3 years ago

Yes exactly

  
  
Posted 3 years ago

LovelyHamster1 from the top, we have two steps:
1. We run the code "manually" (i.e. without the agent). This step creates the experiment (Task) and automatically fills in the "installed packages" (which are in the same format as a regular requirements.txt).
2. An agent runs a cloned copy of the experiment (Task). The agent creates a new venv on the agent's machine, then uses the "installed packages" section as a replacement for the regular "requirements.txt" and installs everything from the "installed packages" into the newly created venv.
The reason behind all of this:
Instead of relying on "requirements.txt" inside the git repo, that is rarely updated, we rely on the list of packages we collected in step (1), because we know that these packages worked for us

Make sense ?
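
The second step (an agent running a cloned copy) can also be triggered from code. The following is a minimal sketch assuming the `Task.get_task`, `Task.clone` and `Task.enqueue` calls of the ClearML SDK; the function name, the "(clone)" suffix and the `task_cls` parameter (which only exists so the flow can be exercised without a server) are illustrative choices, not part of the thread.

```python
def clone_and_enqueue(task_id, queue_name, task_cls=None):
    """Clone an existing Task and send the copy to an agent queue."""
    if task_cls is None:
        # Imported lazily so the sketch can be loaded without ClearML installed.
        from clearml import Task as task_cls

    # Step 1 already happened: the manual run created this Task and
    # filled in its "installed packages".
    source = task_cls.get_task(task_id=task_id)

    # Step 2: the clone carries the same "installed packages" section,
    # which the agent then installs into a fresh venv.
    cloned = task_cls.clone(source_task=source, name=f"{source.name} (clone)")
    task_cls.enqueue(task=cloned, queue_name=queue_name)
    return cloned
```

The task ID and queue name are whatever exists on your own server; nothing here changes how the agent resolves packages.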

  
  
Posted 3 years ago

Is it across the board for any Task?
What would you expect to happen if you clone a Task that used the requirements.txt: would you ignore the full "pip freeze" and use the requirements.txt again, or is this the time we want to use the "installed packages"?

  
  
Posted 3 years ago

Back to the feature request, if this is taken care of (both adding a missed package, and the S3 upload), do you still believe there is room for this kind of feature?

Well, I can set import(s3fs) even if I don't really use it in my own code. One problem could be if this happens for a lot of packages, since I'd then need to add this import to all the entry points of all my repos. While if I just download the right packages from the requirements.txt then I don't need to think about that

  
  
Posted 3 years ago

As an example, in Task.create() there is the possibility to install packages using a requirements.txt, and if not specified, it uses the requirements.txt of the repository. I'd like something like that for Task.init() if possible
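
For reference, the Task.create() behaviour mentioned here might be used roughly as follows. This is a sketch assuming the `requirements_file` argument of `Task.create` in the ClearML SDK; the wrapper function, its defaults and the `task_cls` parameter (present only so the call can be checked without a server) are hypothetical.

```python
def create_task_from_requirements(
    repo, script, requirements_file="requirements.txt", task_cls=None
):
    """Create a Task whose packages come from an explicit requirements file."""
    if task_cls is None:
        # Imported lazily so the sketch can be loaded without ClearML installed.
        from clearml import Task as task_cls

    return task_cls.create(
        project_name="my project",
        task_name="my task",
        repo=repo,                # repository URL, supplied by the caller
        script=script,            # entry point inside the repository
        requirements_file=requirements_file,
    )
```

Unlike Task.init(), this creates the Task without running the script locally, so no local environment is analysed.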

  
  
Posted 3 years ago

Back to the feature request, if this is taken care of (both adding a missed package, and the S3 upload), do you still believe there is room for this kind of feature?

  
  
Posted 3 years ago

Hi AgitatedDove14 , I'm interested in this feature to run the agent and force it to install packages from requirements.txt. Is it available?

  
  
Posted 3 years ago

My problem right now is that PyTorch Lightning needs the s3fs package to store model checkpoints in S3 buckets, but it is not listed in my "installed packages" and I get an import error

  
  
Posted 3 years ago

"Pytorch Lightning need the s3fs " s3fs is not needed, let PL store the model locally and use "output_uri" to automatically upload the model to your S3 bucket.

So I can set output_uri = "s3://<bucket_name>/prefix" and the local models will be loaded into the s3 bucket by ClearML ?

  
  
Posted 3 years ago

No, OK, now I think I get how to use it: "detect_with_pip_freeze" assumes that the machine launching the ClearML task remotely already has all the packages installed with pip, and it stores them in the "installed packages". After this, all the remote clearml-agents will install the packages listed in "installed packages". Correct?

  
  
Posted 3 years ago

Hi LovelyHamster1 ,
you mean totally ignore the "installed packages" section, and only use the requirements.txt ?

  
  
Posted 3 years ago

Yes it does 👍 Btw, at the moment I added import(s3fs) in my entry point and it's working, thank you!

  
  
Posted 3 years ago

SmugOx94 could you please open a GitHub issue with this request, otherwise we might forget 🙂
We might also get some feedback from other users

  
  
Posted 3 years ago

Yes! I think that would be great (and hopefully, helpful also for other people)
Thank you for your support!

  
  
Posted 3 years ago

Please let me know if my explanation is not really clear

  
  
Posted 3 years ago

Make sure you have the S3 credentials in your agent's clearml.conf

OK, this could be a problem, as right now I'm using EC2 instances with an instance profile (I use it in the autoscaler), so they have the right S3 permissions by default. But I'll try it anyway

  
  
Posted 3 years ago

LovelyHamster1
Also, you can use pip freeze instead of the static code analysis. On your development machines set:
detect_with_pip_freeze: true
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/docs/clearml.conf#L169

  
  
Posted 3 years ago

Make sure you have the S3 credentials in your agent's clearml.conf :
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L210

  
  
Posted 3 years ago

I would like to force the usage of those requirements when running any script

How would you force it? Will you just ignore the "Installed Packages" section ?

  
  
Posted 3 years ago

if in the "installed packages" I have all the packages installed from the requirements.txt then I guess I can clone it and use "installed packages"

After the agent finished installing the "requirements.txt" it will put back the entire "pip freeze" into the "installed packages", this means that later we will be able to fully reproduce the working environment, even if packages change (which will eventually happen as we cannot expect everyone to constantly freeze versions)

My problem right now is that PyTorch Lightning needs the s3fs package to store model checkpoints in S3 buckets, but it is not listed in my "installed packages" and I get an import error

You can always manually add a package from code if it is missing: call Task.add_requirements before the Task.init call, or just add the import 🙂

"PyTorch Lightning needs the s3fs" s3fs is not needed; let PL store the model locally and use "output_uri" to automatically upload the model to your S3 bucket. This way, not only will you later be able to switch to any other object storage, you will also be able to log it and download the model from the UI, and have better control over the S3 credentials / security.

LovelyHamster1 WDYT?
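
The two suggestions in this reply can be sketched together as below. This assumes the `Task.add_requirements` and `Task.init(..., output_uri=...)` calls of the ClearML SDK; the helper names, the `require_s3fs` switch and the `task_cls` parameter (there only so the call order can be checked without a server) are hypothetical, and the bucket name is a placeholder.

```python
def s3_output_uri(bucket_name, prefix=""):
    """Build an "s3://<bucket_name>/prefix" style URI (plain string handling)."""
    uri = f"s3://{bucket_name.strip('/')}"
    prefix = prefix.strip("/")
    return f"{uri}/{prefix}" if prefix else uri


def start_task(bucket_name, prefix="", require_s3fs=False, task_cls=None):
    """Initialise a Task that uploads locally stored models to S3 via output_uri."""
    if task_cls is None:
        # Imported lazily so the helper above works without ClearML installed.
        from clearml import Task as task_cls

    if require_s3fs:
        # Optional: force s3fs into "installed packages". Per the reply above,
        # this has to happen before Task.init is called.
        task_cls.add_requirements("s3fs")

    return task_cls.init(
        project_name="my project",
        task_name="my task",
        output_uri=s3_output_uri(bucket_name, prefix),
    )
```

With `output_uri` set, the training framework only needs to write checkpoints locally, which is why the reply says s3fs is not needed in that setup.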

  
  
Posted 3 years ago

Does it work if I launch the clearml-agent on a docker and pip doesn't know the packages to install

Not sure I follow... the "detect_with_pip_freeze" flag (when set) will tell clearml (at runtime) to create the "installed packages" directly from pip freeze (instead of analyzing the code)
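
The difference between the two detection modes can be illustrated with the standard library alone. This is a rough model, not ClearML's actual analyser: `imports_found_in_source` mimics a static scan of the code, and `freeze_like_listing` mimics a pip-freeze style dump of the current environment. Both function names are made up for this sketch.

```python
import ast
from importlib import metadata
from typing import List, Set


def imports_found_in_source(source: str) -> Set[str]:
    """Top-level module names that a static scan of the code can see."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def freeze_like_listing() -> List[str]:
    """Every installed distribution as name==version, similar in spirit to pip freeze."""
    lines = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            lines.append(f"{name}=={dist.version}")
    return sorted(lines)


script = "import pytorch_lightning as pl\nimport numpy as np\n"
detected = imports_found_in_source(script)
# s3fs is never imported by name, so a static scan cannot list it.
print("s3fs" in detected)  # prints False
```

A freeze-style listing would include s3fs whenever it is installed in the launching environment, which is the trade-off discussed in the thread: it captures everything, including packages the script never uses.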

  
  
Posted 3 years ago