AgitatedDove14 Yes, I understand that, but as I mentioned earlier, I don't want to have to edit installed_packages
manually, as others will also be running the same scripts from their own local development machines. This is why I was inquiring about the requirements.txt
file, so that this manual editing would not have to be done by each person running the same scripts.
Additionally, since I'm running a pipeline, would I have to execute each task in the pipeline locally (but still connected to trains), then clone all of the tasks and the pipeline controller and execute them with trains-agent?
Is there a way to do this without manually editing installed packages?
Running your code once with Task.init
should automatically detect all the directly imported packages, then when trains-agent
executes the Task, it will install them to a clean venv and put back all the packages inside the venv.
In order for all the used packages (e.g. bigquery) to appear in the "Installed packages", your code needs to be executed once manually (i.e. not with trains-agent); then trains
will detect them.
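To make "directly imported packages" concrete, here is a minimal sketch of import scanning using only the standard library. It is an illustration, not the detection code trains itself uses; the function name `directly_imported_packages` and the sample source string are assumptions made for this example.

```python
import ast


def directly_imported_packages(source: str) -> set:
    """Return the top-level names of packages imported directly by `source`."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b.c" -> top-level name "a"
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0): they point inside the repo itself.
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return found


# Hypothetical script contents, for illustration only.
example = (
    "import numpy as np\n"
    "from google.cloud import bigquery\n"
    "from . import helpers\n"
)
print(sorted(directly_imported_packages(example)))  # ['google', 'numpy']
```

Note that `from google.cloud import bigquery` only exposes the top-level name `google`, so any tool working from imports still has to map that name to an installable distribution as a separate step.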
GiddyTurkey39 just making sure, you executed the code once manually and the "bigquery" package was missing from it?
in the UI the installed packages will be determined through the code via the imports as usual ...
This is only in a case where a user manually executed their code (i.e. without trains-agent), then in the UI after they clone the experiment, they can click on the "Clear" button (hover over the "installed packages" to see it) and remove all the automatically detected packages. This will result in the trains-agent
using the "requirements.txt".
AgitatedDove14 Thanks! That resolved a lot of my questions.
Which would mean trains will update the installed packages, no?
Can you explain what is meant by this?
I think what you're trying to say is that in the UI the installed packages will be determined through the code via the imports as usual and the users will have to manually clear the installed packages so that the requirements.txt
is used instead.
So to conclude: it has to be executed manually first, then with trains-agent?
Yes. That said, as you mentioned, you can always edit the "installed packages" once manually; from that point you are basically cloning the experiment, including the "installed packages", so it should work if the original worked.
Make sense?
would I have to execute each task in the pipeline locally (but still connected to trains),
Somehow you have to have the pipeline step Task in the system: you can import it from code, or you can run it once; then the pipeline will clone it and reuse it. Am I missing something?
Is there a way to do this without manually editing installed packages? I'm trying to create a setup in which others can utilize the same repo and have their venvs be built similarly without having to do much work on their end (i.e. editing the installed packages). As for why it's failing to detect, I'm not sure.
from google.cloud import bigquery
ImportError: cannot import name 'bigquery' from 'google.cloud' (unknown location)
That's the error I get after the task makes use of the function contained within the other file in the repo. It's a standard import error but since it's able to detect the file in the first place it shouldn't be a problem with the repo structure?
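One quick way to check whether the failure comes from the environment rather than the repo layout is to test, inside the venv the agent built, whether the module can be located at all. This is a minimal sketch using only the standard library; the helper name `is_importable` is an assumption made for this example.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec.
        return False


# False here would suggest the package is simply not installed in this
# environment, not that the repo structure is wrong.
print(is_importable("google.cloud.bigquery"))
```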
Hi GiddyTurkey39
First, yes you can just edit the "installed packages" section and add any missing package (this is equal to requirements.txt)
I wonder why trains
failed to detect the "bigquery" package in the first place... Any thoughts?
Since I'm running a pipeline, does this mean I have to execute each task of the pipeline individually and manually? Then uncomment task.execute_remotely()
from each task and then run via trains-agent?
I didn't execute the code manually, I executed it from trains-agent to begin with. So to conclude: it has to be executed manually first, then with trains-agent?
GiddyTurkey39
as others will also be running the same scripts from their own local development machine
Which would mean trains
will update the installed packages, no?
This is why I was inquiring about the
requirements.txt
file,
My apologies, of course this is supported 🙂
If you have no "installed packages" (i.e. the field is empty in the UI) the trains-agent
will revert to installing the requirements.txt
from the git repo itself, then it will update back the actual installed package versions, so next time you will be able to reproduce the same environment that worked
Does that solve the problem ?
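The fallback described above can be sketched as a small decision function. This is only an illustration of the rule as stated in this thread (empty "installed packages" means the repo's requirements.txt is used), not trains-agent's actual code; the function name `resolve_requirements` and its arguments are hypothetical.

```python
def resolve_requirements(installed_packages: str, repo_requirements: str) -> list:
    """Pick the requirement lines to install.

    installed_packages: text of the "installed packages" field (may be empty).
    repo_requirements:  text of requirements.txt from the git repo.
    """
    # Empty field -> fall back to the repo's requirements.txt.
    source = installed_packages if installed_packages.strip() else repo_requirements
    lines = []
    for line in source.splitlines():
        line = line.strip()
        # Ignore blank lines and comments.
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


# Cleared "installed packages": requirements.txt wins.
print(resolve_requirements("", "google-cloud-bigquery\npandas\n"))
# Non-empty "installed packages": the field is used as-is.
print(resolve_requirements("numpy==1.19.0\n", "pandas\n"))
```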