Answered

Hi, I have an issue, but let's start with the description. This is a snippet of my project's structure:

├── .git/                      <- Git configuration
├── configs/                   <- Hydra configs
│   └── preprocessing/
│       └── config.yaml
└── steps/                     <- Python scripts
    ├── preprocess.py
    └── pipeline.py
  • In preprocess.py I use hydra with the decorator @hydra.main(config_path="../configs/preprocessing/", config_name="config", version_base=None)
  • pipeline.py is set up as a task-based pipeline

Steps of my experiment:
  • I run 'python steps/preprocess.py'
  • I run 'python steps/pipeline.py'

And everything works fine: the pipeline runs based on the task generated by preprocess.py.

But then I wanted to get rid of git (I simply removed the .git/ directory) and it doesn't work anymore. The first step runs correctly, but I get an error when running the pipeline script: this time the hydra config cannot be read in the preprocessing task. The error is "Check that the config directory '/tmp/configs/preprocessing' exists and readable". I have no idea where the path '/tmp/configs/preprocessing' comes from or why this issue is related to git at all.

It seems to be related to the working directory path, the script path, or something like that: I see a different value for 'SCRIPT PATH' in the web UI. For example, after running preprocess.py with git present, the script path is 'steps/preprocess.py', but after running it the same way without git, the value is 'preprocess.py'.

Can you help me with that?

  
  
Posted one year ago

Answers 8


@<1523701070390366208:profile|CostlyOstrich36> 1) I attached the logs as a text file. 2) I want to develop my code in a docker container, and there's no place for git there (as far as I understand the purpose of using docker at all). Regardless of my use case, this type of simple pipeline should work with no git dependency, right? 3) I don't use the agent in this case.

  
  
Posted one year ago

  • Attached verbose logs.
  • There's no point in running the pipeline script without clearml, since the script is built on a clearml feature (the pipeline itself). All other scripts that can be run without clearml work fine.
  • The hydra config file can't be read when preprocess.py is run via the pipeline, but it works fine when I run it directly (in the no-git case).
  
  
Posted one year ago

@<1554638160548335616:profile|AverageSealion33> Can you run the script with HYDRA_FULL_ERROR=1? Also, what happens if you run the script without clearml? Do you get the same error?

  
  
Posted one year ago

Hi @<1554638160548335616:profile|AverageSealion33>! We pull git repos to copy the directory your task is running in. Because you deleted .git, we can't do that anymore. I think that, to fix this, you could just run the agent in the directory where .git previously existed.

  
  
Posted one year ago

Hi @<1554638160548335616:profile|AverageSealion33>, can you add a log with and without git? What is the use case, and why do you want to remove git? And just making sure: is this run through the agent?

  
  
Posted one year ago

Thank you, it works perfectly now 🙂

  
  
Posted one year ago

@<1554638160548335616:profile|AverageSealion33> It looks like hydra resolves the config path relative to the script's directory, not the current working directory. The pipeline controller actually creates a temp file in /tmp when it pulls the step, so the script's directory will be /tmp, and when searching for ../data, hydra will search in /. The .git likely caused your repository to be pulled, so your repo structure was created in /tmp, which allowed the step to run correctly.
To make hydra search relative to the cwd instead, you could do this:

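from clearml import Task
import os
import hydra


# Build config_path from the current working directory instead of the script's location.
# Note: the "../" part is resolved against the cwd, so adjust it to match where you launch from.
@hydra.main(config_path=os.path.join(os.getcwd(), "../configs/preprocessing/"), config_name="config", version_base=None)
def main(cfg):
    Task.init("hydra", "hydra")  # project name, task name
    print(cfg)


if __name__ == "__main__":
    main()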
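
To illustrate the path resolution described above, here is a minimal sketch that only uses the standard library; it does not call hydra or clearml. The helper resolve_config_dir and both directory names are hypothetical, and the temporary directory is simply assumed to sit one level below /tmp so that the result matches the path in the error message. The real temp location used by the pipeline controller may differ.

```python
import posixpath


def resolve_config_dir(script_dir: str, config_path: str) -> str:
    """Join a relative config_path onto script_dir and normalise the result.

    This mimics resolving a relative config path against the directory
    of the running script (hypothetical helper, not a hydra API).
    """
    return posixpath.normpath(posixpath.join(script_dir, config_path))


config_path = "../configs/preprocessing/"

# Hypothetical script locations, chosen only for illustration.
repo_script_dir = "/home/user/project/steps"  # script run from the full repo
temp_script_dir = "/tmp/steps"                # script copied alone to a temp dir

print(resolve_config_dir(repo_script_dir, config_path))
# -> /home/user/project/configs/preprocessing

print(resolve_config_dir(temp_script_dir, config_path))
# -> /tmp/configs/preprocessing
```

In the first case configs/ sits next to steps/, so the directory exists. In the second case only the script was copied, so the resolved directory is missing, which would be consistent with the reported error.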
  
  
Posted one year ago

@<1523701435869433856:profile|SmugDolphin23> I'm having trouble understanding what you mean. I don't use the agent here.

@<1523701435869433856:profile|SmugDolphin23> @<1523701070390366208:profile|CostlyOstrich36> Maybe you could look at my code ( None ) to get a better view of the case. (I really doubt that looking at the code will help solve it, since the issue is related to the presence of git.) There's no clearml agent or any complex settings.

  
  
Posted one year ago