At least as far as I can tell, nothing else has changed on our systems. Previous pip versions would warn about this, but not crash.
For now we've monkey-patched it for our use case:
` Dataset._Dataset__hidden_tag = "active"
# Replacement for Dataset._build_hidden_project_name
def foo(cls, dataset_project, dataset_name):
    dataset_project = dataset_project or "Datasets"
    return dataset_project, dataset_project.rpartition("/")[0]
Dataset._build_hidden_project_name = foo `
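For anyone who wants to try the patching pattern above without touching ClearML, here is a minimal sketch. The `Dataset` class below is a hypothetical stand-in (not the real `clearml.Dataset`), and its original method body is made up. It only illustrates how a name-mangled attribute and a method can be overridden from outside the class. Wrapping `foo` in `classmethod` is an assumption that holds for this stand-in; whether it is needed depends on how the real method is declared and called.

```python
class Dataset:
    """Hypothetical stand-in; NOT the real clearml.Dataset."""

    # Double-underscore attributes are name-mangled to _Dataset__hidden_tag
    __hidden_tag = "hidden"

    @classmethod
    def _build_hidden_project_name(cls, dataset_project, dataset_name):
        # Made-up default behaviour, for illustration only
        return "{}/.datasets/{}".format(dataset_project, dataset_name), dataset_project

    @classmethod
    def describe(cls, dataset_project, dataset_name):
        # Inside the class body, cls.__hidden_tag resolves to cls._Dataset__hidden_tag
        return cls.__hidden_tag, cls._build_hidden_project_name(dataset_project, dataset_name)


def foo(cls, dataset_project, dataset_name):
    # Fall back to a default project, then return the project and its parent path
    dataset_project = dataset_project or "Datasets"
    return dataset_project, dataset_project.rpartition("/")[0]


# From outside the class, the mangled name has to be spelled out explicitly
Dataset._Dataset__hidden_tag = "active"
# Wrapped in classmethod so that `cls` is still bound when called on the class
Dataset._build_hidden_project_name = classmethod(foo)

patched = Dataset.describe("team/vision", "cats")
fallback = Dataset.describe(None, "cats")
```

Note that `rpartition("/")[0]` yields an empty string when the project name contains no slash, which is what happens for the `"Datasets"` fallback.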
That's fine for the current use-case I believe.
Once the team is happy with the logging functionality, we'll move on to remote execution and things will update.
Would be great if it is 😍 We have a few files that change frequently and are quite large, and it would be quite a storage hit to save all of them
A follow-up question (instead of opening a new thread): is there a way I could signal some files/directories to be copied to the execute_remotely task?
I'm working on the config object references 😉
Sorry, I misspoke, yes of course, the agent's config file, not the queues
In which repo? :)
The error seems to come from this line: `self._driver = _FileStorageDriver(str(path_driver_uri.root))` (line #353 in clearml/storage/helper.py)
If the path_driver is a local path, the _FileStorageDriver starts with `base_path = '/'`, and then takes an extremely long time iterating over the entire file system (e.g. in `_get_objects`, line #1931 in helper.py)
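To illustrate why the base path matters, here is a small sketch using only the standard library. The function name `list_from_base` is hypothetical and this is not ClearML's implementation. It just shows that a prefix lookup done by walking from a broad base path visits every file under that base, while walking from the narrowest directory visits only what is needed.

```python
import os
import tempfile


def list_from_base(base_path, prefix):
    """Walk everything under base_path and keep files whose path starts with prefix.

    Returns the sorted matches and the number of files visited along the way.
    """
    visited = 0
    matches = []
    for root, _dirs, files in os.walk(base_path):
        for name in files:
            visited += 1
            full_path = os.path.join(root, name)
            if full_path.startswith(prefix):
                matches.append(full_path)
    return sorted(matches), visited


with tempfile.TemporaryDirectory() as tmp:
    wanted = os.path.join(tmp, "wanted")
    other = os.path.join(tmp, "other")
    os.makedirs(wanted)
    os.makedirs(other)
    for i in range(3):
        open(os.path.join(wanted, "w{}.txt".format(i)), "w").close()
    for i in range(50):
        open(os.path.join(other, "o{}.txt".format(i)), "w").close()

    prefix = wanted + os.sep
    # Broad base: every file under tmp is visited to find the three we want
    broad_matches, broad_visited = list_from_base(tmp, prefix)
    # Narrow base: only the files in the wanted directory are visited
    narrow_matches, narrow_visited = list_from_base(wanted, prefix)
```

Both calls return the same matches; only the amount of work differs. With the base at `/`, the broad case becomes a walk over the whole file system.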
That still seems to crash SuccessfulKoala55 🤔
EDIT: No, wait, the environment still needs updating. One moment still...
And `task = Task.init(project_name=conf.get("project_name"), ...)` is basically a no-op in remote execution, so it does not matter if conf is empty, right?
Heh, my bad, the term "user" is very much ingrained in our internal way of working. You can think of it as basically any technically-inclined person in your team or company.
Indeed the options in the WebUI are too limited for our use case, so we've developed "apps" that take a yaml configuration file and build a matching pipeline.
With that, our users do not need to code directly, and we can offer much finer control over the pipeline.
As for the imports, what I meant is that I encounter...
I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.
Or is that functionality provided by setting offline mode and then importing an offline task?
Is it `CLEARML_CONFIG_FILE`? (I had to dig this from the GH code 😅)
Ah, the API server `/users.get_all`, I see!
This seems to be fine for now, btw, in case any future lookup finds this thread: `with mock.patch('clearml.datasets.dataset.Dataset.create'): ...`
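A self-contained sketch of the same mocking idea, for reference. The `Dataset` class and `register_dataset` function are hypothetical stand-ins so the snippet runs without ClearML installed. It uses `mock.patch.object` on the stand-in; the string-target form `mock.patch('clearml.datasets.dataset.Dataset.create')` quoted above works the same way but needs the real module to be importable.

```python
from unittest import mock


class Dataset:
    """Hypothetical stand-in for a class whose create() would contact a backend."""

    @classmethod
    def create(cls, dataset_name, dataset_project):
        raise RuntimeError("would contact the backend")


def register_dataset(name, project):
    """Hypothetical code under test that calls Dataset.create."""
    return Dataset.create(dataset_name=name, dataset_project=project)


# Inside the context manager, Dataset.create is replaced by a MagicMock
with mock.patch.object(Dataset, "create") as fake_create:
    fake_create.return_value = "fake-dataset"
    result = register_dataset("cats", "vision")

# The mock records how it was called, so the call can be verified afterwards
fake_create.assert_called_once_with(dataset_name="cats", dataset_project="vision")

# Outside the context manager, the original method is restored
try:
    Dataset.create(dataset_name="cats", dataset_project="vision")
    restored = False
except RuntimeError:
    restored = True
```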
We’d be happy if ClearML captures that (since it uses e.g. pip, then we have the git + commit hash for reproducibility), as it claims it would 😅
Any thoughts CostlyOstrich36 ?
We have the following, works fine (we also use internal zip packaging for our models):
model = OutputModel(task=self.task, name=self.job_name, tags=kwargs.get('tags', self.task.get_tags()), framework=framework)
model.connect(task=self.task, name=self.job_name)
model.update_weights(weights_filename=cc_model.save())
It misses the repository information of course, but the 'configuration/Args' were logged. So something weird is going on in identifying the repository
And last but not least, for dictionaries for example, it would be really cool if one could do:
my_config = task.connect_configuration(my_config, name=name)
my_other_config = task.connect_configuration(my_other_config, name=other_name)
my_other_config['bar'] = my_config  # Creates the link automatically between the dictionaries
Let me know if there's any additional information that can help SuccessfulKoala55 !
This could be relevant SuccessfulKoala55 ; might entail some serious bug in ClearML multiprocessing too - https://stackoverflow.com/questions/45665991/multiprocessing-returns-too-many-open-files-but-using-with-as-fixes-it-wh
I'm saying it's a bug
Great, thanks! Any idea about environment variables and/or other files (CSV)? I suppose I could use task.upload_artifact for the CSVs, but I'm still unsure about the environment variables
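One possible approach for the environment variables, offered as an assumption and not something confirmed in this thread: gather the variables that matter into a plain dict, which can then be logged like any other parameter dictionary. The helper name `collect_env` and the variable names below are hypothetical.

```python
import os


def collect_env(names, environ=None):
    """Return a dict of the requested environment variables that are actually set."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in names if name in environ}


# A fake environment is passed in here so the example is deterministic;
# leave `environ` out to read from the real os.environ instead.
snapshot = collect_env(
    ["DATA_ROOT", "MISSING_VAR"],
    environ={"DATA_ROOT": "/data", "HOME": "/home/me"},
)
```

Only an explicit allow-list is captured, which avoids logging secrets that happen to be present in the environment.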