Answered
When launching a task to trains agent, I'm having trouble getting the imports from other files working correctly

When launching a task to trains agent, I'm having trouble getting the imports from other files working correctly.
For instance, my task imports a function from another file within the same git repo ( from other.file import my_function ), and other.file has imports of its own, e.g. from google.cloud import bigquery .
Trains is not installing google-cloud-bigquery when creating the virtual environment. How can this be resolved? Do I need to have a requirements.txt file? If so, how can I make sure that trains installs the packages from requirements.txt?

  
  
Posted 3 years ago

Answers 11


Is there a way to do this without manually editing installed packages? I'm trying to create a setup in which others can use the same repo and have their venvs built the same way without much work on their end (i.e. without editing the installed packages). As for why it's failing to detect the package, I'm not sure.
from google.cloud import bigquery
ImportError: cannot import name 'bigquery' from 'google.cloud' (unknown location)
That's the error I get after the task makes use of the function contained within the other file in the repo. It's a standard import error, but since it's able to find the file in the first place, it shouldn't be a problem with the repo structure?
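One way to narrow this error down is to check, inside the environment the agent built, whether the parent namespace and the submodule can each be located. The sketch below uses only the standard library; the helper name is_importable is hypothetical and is not part of trains.

```python
import importlib.util


def is_importable(dotted_name):
    """Return True if `dotted_name` can be located.

    Parent packages are imported as a side effect, but the target
    module itself is not executed.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (e.g. `google.cloud`) is itself missing.
        return False


# The result depends on what is installed in the current environment.
for name in ("google.cloud", "google.cloud.bigquery"):
    print(name, is_importable(name))
```

If google.cloud is found but google.cloud.bigquery is not, the repo layout is fine and the google-cloud-bigquery distribution is simply missing from the venv.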

  
  
Posted 3 years ago

AgitatedDove14 Yes, I understand that, but as I mentioned earlier, I don't want to have to edit installed_packages manually, as others will also be running the same scripts from their own local development machines. This is why I was inquiring about the requirements.txt file, so that this manual editing would not have to be done by each person running the same scripts.

Additionally, since I'm running a pipeline, would I have to execute each task in the pipeline locally (but still connected to trains), then clone all of the tasks and the pipeline controller and execute them with trains-agent?

  
  
Posted 3 years ago

So to conclude: it has to be executed manually first, then with trains agent?

Yes. That said, as you mentioned, you can always edit the "installed packages" once manually. From that point on you are basically cloning the experiment, including its "installed packages", so it should work if the original worked.
Make sense?

  
  
Posted 3 years ago

in the UI the installed packages will be determined through the code via the imports as usual ...

This is only for the case where a user manually executed their code (i.e. without trains-agent). In the UI, after they clone the experiment, they can click the "Clear" button (hover over "installed packages" to see it) and remove all the automatically detected packages. This will result in the trains-agent using the "requirements.txt".

  
  
Posted 3 years ago

would I have to execute each task in the pipeline locally(but still connected to trains),

Somehow you have to have the pipeline step Task in the system: you can import it from code, or you can run it once, and then the pipeline will clone it and reuse it. Am I missing something?

  
  
Posted 3 years ago

GiddyTurkey39

as others will also be running the same scripts from their own local development machine

Which would mean trains will update the installed packages, no?

This is why I was inquiring about the requirements.txt file,

My apologies, of course this is supported 🙂
If you have no "installed packages" (i.e. the field is empty in the UI), the trains-agent will fall back to installing the requirements.txt from the git repo itself. It will then write the actual installed package versions back to the Task, so next time you will be able to reproduce the same environment that worked.
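The rule described here can be sketched as a small decision function. This is only an illustration of the behaviour as stated in this thread, not trains-agent's actual code; choose_requirements and the sample package lists are hypothetical.

```python
def choose_requirements(installed_packages, repo_requirements):
    """Pick which package list to install, following the rule described above.

    Both arguments are plain text, one requirement per line.
    Returns a (source, requirements) tuple.
    """
    def clean(text):
        # Keep non-empty lines that are not full-line comments.
        lines = (line.strip() for line in text.splitlines())
        return [line for line in lines if line and not line.startswith("#")]

    recorded = clean(installed_packages)
    if recorded:
        # A non-empty "installed packages" field wins; requirements.txt is ignored.
        return "installed packages", recorded
    # Empty field: fall back to the requirements.txt found in the git repo.
    return "requirements.txt", clean(repo_requirements)


repo_file = "google-cloud-bigquery\ntrains\n"  # hypothetical requirements.txt
print(choose_requirements("", repo_file))
print(choose_requirements("trains\n", repo_file))
```

The second call shows why a partially detected list is a problem: as long as the field is non-empty, a package listed only in requirements.txt is never installed.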

Does that solve the problem ?

  
  
Posted 3 years ago

Hi GiddyTurkey39
First, yes, you can just edit the "installed packages" section and add any missing package (this is equivalent to requirements.txt).
I wonder why trains failed to detect the "bigquery" package in the first place... Any thoughts?

  
  
Posted 3 years ago

I didn't execute the code manually, I executed it from trains-agent to begin with. So to conclude: it has to be executed manually first, then with trains agent?

  
  
Posted 3 years ago

Since I'm running a pipeline, does this mean I have to execute each task of the pipeline individually and manually? Then uncomment task.execute_remotely() from each task and then run via trains-agent ?

  
  
Posted 3 years ago

AgitatedDove14 Thanks! That resolved a lot of my questions.

Which would mean trains will update the installed packages, no?

Can you explain what is meant by this?
I think what you're trying to say is that in the UI the installed packages will be determined through the code via the imports as usual and the users will have to manually clear the installed packages so that the requirements.txt is used instead.

  
  
Posted 3 years ago

Is there a way to do this without manually editing installed packages?

Running your code once with Task.init should automatically detect all the directly imported packages. Then, when trains-agent executes the Task, it will install them into a clean venv and record all the packages in that venv back into the Task.

In order for all the used packages (e.g. bigquery) to appear in the "Installed packages", your code needs to be executed once manually (i.e. not with trains-agent); trains will then detect them.
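To make "directly imported packages" concrete, here is a minimal sketch of import scanning using the standard ast module. It is not how trains performs its detection (the thread does not show that); top_level_imports and the two sample sources are hypothetical stand-ins for the task script and other/file.py from the question.

```python
import ast


def top_level_imports(source):
    """Return the set of top-level module names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import (from . import x); skip those.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


task_script = "from other.file import my_function\nimport os\n"
helper_module = "from google.cloud import bigquery\n"

print(sorted(top_level_imports(task_script)))    # ['os', 'other']
print(sorted(top_level_imports(helper_module)))  # ['google']
```

Two things stand out in this toy version: scanning only the entry script never sees the helper's imports, and the import name (google) differs from the distribution name (google-cloud-bigquery), so a detector needs a mapping from one to the other. Whether either point explains the missing package in this thread is not confirmed.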

GiddyTurkey39 just making sure, you executed the code once manually and the "bigquery" package was missing from it?

  
  
Posted 3 years ago