AgitatedDove14 Thanks! That resolved a lot of my questions.
Which would mean `trains` will update the installed packages, no?
Can you explain what is meant by this?
I think what you're trying to say is that in the UI the installed packages will be determined through the code via the imports as usual, and the users will have to manually clear the installed packages so that the `requirements.txt` is used instead.
AgitatedDove14 Yes, I understand that, but as I mentioned earlier, I don't want to have to edit `installed_packages` manually, as others will also be running the same scripts from their own local development machines. This is why I was inquiring about the `requirements.txt` file, so that this manual editing would not have to be done by each person running the same scripts.
Additionally, since I'm running a pipeline, would I have to execute each task in the pipeline locally (but still connected to trains), then clone all of the tasks and the pipeline controller and execute with trains-agent?
Is there a way to do this without manually editing installed packages? I'm trying to create a setup in which others can utilize the same repo and have their venvs be built similarly without having to do much work on their end (i.e. editing the installed packages). As for why it's failing to detect, I'm not sure.
`from google.cloud import bigquery`
`ImportError: cannot import name 'bigquery' from 'google.cloud' (unknown location)`
That's the error I get after the task makes use of the function contained within the other file in the repo. It's a standard import error, but since it's able to detect the file in the first place, it shouldn't be a problem with the repo structure?
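As a side note, this particular error usually means the `google-cloud-bigquery` distribution is missing from the venv the task runs in, even though the `google.cloud` namespace package exists (so `pip install google-cloud-bigquery` in that venv is the usual fix). A minimal check, with a helper name of my own invention:

```python
import importlib.util


def bigquery_installed() -> bool:
    """Return True if the google.cloud.bigquery subpackage can be found."""
    try:
        # find_spec returns None when the subpackage is absent from the venv;
        # it raises ModuleNotFoundError when the google.cloud namespace
        # package itself is missing entirely.
        return importlib.util.find_spec("google.cloud.bigquery") is not None
    except ModuleNotFoundError:
        return False
```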
Is there a way to do this without manually editing installed packages?
Running your code once with `Task.init` should automatically detect all the directly imported packages; then, when `trains-agent` executes the Task, it will install them into a clean venv and put back all the packages installed inside the venv.
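If it helps, a minimal sketch of that one-time manual run (the project and task names are placeholders, and this assumes the `trains` package is installed):

```python
def start_tracked_run():
    """One-time manual run that populates the Task's "installed packages"."""
    # Import deferred so this sketch stays self-contained without trains installed.
    from trains import Task

    # Task.init inspects the running script and records the directly imported
    # packages (with their local versions) on the experiment in the UI.
    task = Task.init(project_name="my-project", task_name="my-script")
    return task
```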
In order for all the used packages (e.g. bigquery) to appear in the "Installed packages", your code needs to be executed once manually (i.e. not with trains-agent); then trains will detect them.
GiddyTurkey39 just making sure, you executed the code once manually and the "bigquery" package was missing from it?
would I have to execute each task in the pipeline locally (but still connected to trains),
Somehow you have to have the pipeline step Task in the system, you can import it from code, or you can run it once, then the pipeline will clone it and reuse it. Am I missing something ?
in the UI the installed packages will be determined through the code via the imports as usual ...
This is only in a case where a user manually executed their code (i.e. without trains-agent); then in the UI, after they clone the experiment, they can click on the "Clear" button (hover over the "installed packages" to see it) and remove all the automatically detected packages. This will result in `trains-agent` using the `requirements.txt`.
So to conclude: it has to be executed manually first, then with trains-agent?
Yes. That said, as you mentioned, you can always edit the "installed packages" manually once; from that point you are basically cloning the experiment, including the "installed packages", so it should work if the original worked.
Make sense ?
as others will also be running the same scripts from their own local development machines
Which would mean `trains` will update the installed packages, no?
This is why I was inquiring about the `requirements.txt`
My apologies, of course this is supported 🙂
If you have no "installed packages" (i.e. the field is empty in the UI), `trains-agent` will revert to installing the `requirements.txt` from the git repo itself; then it will update back the actual installed package versions, so next time you will be able to reproduce the same environment that worked.
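For example, a `requirements.txt` committed at the repo root could look like this (the package list is just an illustration):

```
# requirements.txt at the repository root
trains
google-cloud-bigquery
```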
Does that solve the problem ?