Or if it wasn't clear, that chunk of code is from clearml's dataset.py
I'm guessing that's not on pypi yet?
Thanks David! I appreciate that, it would be very nice to have a consistent pattern for this!
Sounds like a nice idea 😁
Follow-up: any ideas how to avoid PEP 517 with the autoscaler? 🤔 It takes a long time to build the wheels
I guess it does not do so for all settings, but only those that come from Session()
We just inherit from logging.Handler and use that in our logging.config.dictConfig; the weird thing is that it still logs most of the tasks, just not the last one?
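For context, the pattern is roughly this (a minimal sketch; the handler class name and what emit() does with each record are illustrative placeholders, not our actual code):
```python
import logging
import logging.config


# Illustrative handler: the name and the body of emit() are placeholders
class TaskLogHandler(logging.Handler):
    def emit(self, record):
        print(self.format(record))


logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # "()" lets dictConfig construct a custom handler class
        "task": {"()": TaskLogHandler},
    },
    "root": {"handlers": ["task"], "level": "INFO"},
})

logging.getLogger(__name__).info("hello from the custom handler")
```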
Another example - trying to validate dataset interactions ends with
```python
else:
    self._created_task = True
    dataset_project, parent_project = self._build_hidden_project_name(dataset_project, dataset_name)
    task = Task.create(
        project_name=dataset_project, task_name=dataset_name, task_type=Task.TaskTypes.data_processing)
    if bool(Session.check_min_api_server_version(Dataset.__min_api_version)):
        get_or_create_proje...
```
It could be related to ClearML agent or server then. We temporarily upload a given .env file to internal S3 bucket (cache), then switch to remote execution. When the remote execution starts, it first looks for this .env file, downloads it using StorageManager, uses dotenv, and then continues the execution normally
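Roughly like this (a sketch of that flow; the bucket path, project/task names, and queue name are illustrative):
```python
from clearml import StorageManager, Task
from dotenv import load_dotenv

# Illustrative location on our internal S3 cache bucket
REMOTE_ENV = "s3://internal-bucket/cache/.env"

# Locally: push the .env, then hand the task off to remote execution
StorageManager.upload_file(local_file=".env", remote_url=REMOTE_ENV)
task = Task.init(project_name="example", task_name="example")
task.execute_remotely(queue_name="default")

# On the remote worker: fetch the .env, load it, and continue normally
local_env = StorageManager.get_local_copy(REMOTE_ENV)
load_dotenv(local_env)
```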
I dunno 🤷‍♂️ but Task.init is clearly incompatible with pytest and friends
Basically, when there are occasionally extreme values (e.g. most values fall in the [0, 50] range and one value suddenly lands around 50e+12), the plotting library (matplotlib or ClearML, unsure which) hangs for a really long time
Without knowing anything, I'm assuming maybe ClearML patches plt.title and not Axes.set_title?
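Just to illustrate the two call paths I mean (a minimal sketch, nothing ClearML-specific; headless backend just for the example):
```python
import matplotlib
matplotlib.use("Agg")  # headless backend, just for the example
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [10, 20, 50e12])  # one extreme outlier among small values

# A wrapper around plt.title would only see this call...
plt.title("set via pyplot")
# ...but not this one, which goes straight to the Axes method
ax.set_title("set via Axes")

fig.savefig("example.png")
```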
Same result 😞 This is frustrating, wtf happened 🤯
This is also specifically the services queue worker I'm trying to debug 🤔
Say I have Task A that works with some dataset (which is not hard-coded, but perhaps e.g. self-defined by the task itself).
I'd now like to clone Task A and modify some stuff, but still use the same dataset (no need to recreate it, but since it's not hard-coded, I have to maintain a reference somewhere to the dataset ID).
Since the Dataset SDK offers use_current_task, I would have also expected there to be something like dataset.link(task) or task.register_dataset(ds)...
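In the meantime, the closest I can think of is stashing the dataset ID on the task myself, roughly like this (a sketch; the "General/dataset_id" parameter name is my own choice, not an SDK convention):
```python
from clearml import Dataset, Task

task = Task.init(project_name="example", task_name="task-a")

# Task A resolves its dataset however it likes, then records the ID on itself
ds = Dataset.get(dataset_project="example", dataset_name="my-dataset")
task.set_parameter("General/dataset_id", ds.id)

# A clone of Task A can later read the stored ID back and reuse the same dataset
dataset_id = task.get_parameter("General/dataset_id")
ds = Dataset.get(dataset_id=dataset_id)
local_copy = ds.get_local_copy()
```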
You mean at the container level or at ClearML?
Yes, the container level (when these docker shell scripts run).
The per user ID would be nice, except I upload the .env file before the Task is created (it's only available really early in the code).
Nope, no other config files
I'm trying, let's see; our infra person is away on holidays :X Thanks! Uh, which configuration exactly would you like to see? We're running using the helm charts on K8s, so I don't think I have direct access to the agent configuration or a way to update it separately?
This happened again 🤔
How many files does ClearML touch? 🤯
I just ran into this too recently. Are you passing these also in the extra_clearml_conf for the autoscaler?
Also something we are very much interested in (including the logger-based scatter plots etc)
If relevant, I'm using Chrome Version 101.0.4951.41 (Official Build) (64-bit)
I thought this follows from our previous discussion, SuccessfulKoala55, where this is a built-in feature of pyhocon?
AFAIK that's the only way right now (see my comment here: https://clearml.slack.com/archives/CTK20V944/p1657720159903739?thread_ts=1657699287.630779&cid=CTK20V944)
Or, if you have the ClearML paid service, I believe there is a "vaults" service, right AgitatedDove14?
Right, so this is checksum-based? Are there plans to only store delta changes for files (i.e. store the changed bytes instead of the entire file)?
Hm, this didn't happen until now; I'd be happy to try again with a new version, but something with 1.4.0 broke our StorageManager, so we reverted to 1.3.2
Yeah I figured (2) would be the way to go actually 😄
One must then ask, of course, what to do if e.g. a text refers to a dictionary configuration object? 🤔
Ah, it already exists: https://github.com/allegroai/clearml-server/issues/134, so I commented on it
```python
# test_clearml.py
import shutil

import pytest

import clearml


@pytest.fixture
def clearml_task():
    # Run offline so no ClearML server is needed for the test
    clearml.Task.set_offline_mode(True)
    task = clearml.Task.init(project_name="test", task_name="test")
    yield task
    # Clean up the offline session folder and restore the default mode
    shutil.rmtree(task.get_offline_mode_folder())
    clearml.Task.set_offline_mode(False)


class TestClearML:
    def test_something(self, clearml_task):
        assert True
```
run with `pytest test_clearml.py`
Then the username and password would be visible in the autoscaler task 😕
But it should work out of the box, since it works like that regardless of ClearML: the user and personal access token are used as is, and they propagate down to submodules, since those are simply another git repository.
I've done further checks on a different machine and it works there as well 🤔