Great to hear @<1523701087100473344:profile|SuccessfulKoala55>! Is there an estimated timeline for these releases?
I didn't mention code in #340 nor did I mention data here 😄 The idea was to package non git-specific files for remote execution
TimelyPenguin76 I added pip install --upgrade clearml-agent to the extra_vm_bash_script for the autoscaler; that should at least guarantee the latest clearml-agent is used on the instance, right?
Then the username and password would be visible in the autoscaler task 😕
But it should work out of the box, as it already works that way regardless of ClearML. The username and personal access token are used as-is, and they propagate down to submodules, since those are simply another git repository.
I've run further checks on a different machine and it works there as well 🤔
They are set with a .env file - it's common practice. The .env file is, at the moment, uploaded to a temporary cache (if you remember the discussion regarding the StorageManager), so it's also available remotely (related to issue #395)
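Upload-wise it's just something like this sketch (the bucket/prefix here is a placeholder for the temporary cache location, not our actual setup):

```python
from clearml import StorageManager

# Push the local .env file to the temporary cache so the remote run can fetch it later
StorageManager.upload_file(local_file=".env", remote_url="s3://my-bucket/cache/.env")
```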
That could work, given that:
- Could we add a preview section? One reason I don't like using the configuration section is that it makes debugging much, much harder.
- Will the clearml-agent download and unzip the files, placing them into the same local folder as needed for execution?
- What if we want to include non-configuration objects? (i.e. the model case I listed)
Yes -- that's what I meant by "the title is specified in the plot". I make the plots manually - title, axes labels, ticks, etc. In that sense, the figure is entirely configured. ClearML just saves it as "untitled 00/plot image".
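For concreteness, this is the kind of figure I mean, as a minimal sketch (the data, title, and labels are made up for illustration):

```python
import matplotlib.pyplot as plt

# Dummy data just for illustration
folds = [0, 1, 2, 3]
mae = [0.31, 0.28, 0.30, 0.27]

fig, ax = plt.subplots()
ax.plot(folds, mae, marker="o")
ax.set_title("Validation MAE per fold")  # the title is specified in the plot
ax.set_xlabel("fold")
ax.set_ylabel("MAE")
plt.show()  # auto-captured by ClearML, yet it shows up as "untitled 00/plot image"
```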
For example, we have a complicated YAML file with built-in !include instructions, so we upload all the included files too. This then clogs up the artifacts sidebar, and it would be nice to be able to say "these are all artifacts from this one file, you can collapse it by clicking here"
I'll kill the agent and try again but with the detached mode 🤔
That could be a solution for the regex search; my comment on the pop-up (in the previous reply) was a bit more generic - just that it should potentially include some information on what failed while fetching experiments 😄
I also tried switching to dockerized mode now, getting the same issue 🤔
First bullet point - yes, exactly
Second bullet point - all of it, really. The SDK documentation and the examples.
For example, the Task object is heavily overloaded and its documentation would benefit from being separated into logical units of work. It would also make it easier for the ClearML team to spot any formatting issues.
Any example linked to GitHub is welcome, but some visualization/inline code with an explanation is also very much welcome.
The overall flow I currently have is e.g.
1. Start an internal task (not a ClearML Task; MLOps not initialized yet)
2. Call some pre_init function with args, so I can upload the environment file via StorageManager to S3
3. Call some start_run function with the configuration dictionary loaded, so I can upload the relevant CSV files and configuration file
4. Finally initialize the MLOps (ClearML), start a task, and execute remotely
I can play around with 3/4 (so e.g. upload CSVs and configuratio...
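In rough pseudo-code, a minimal sketch of that flow (the function names mirror the steps above; the bucket URLs, config key, file paths, and queue name are placeholders, not our actual setup):

```python
from clearml import Task, StorageManager


def pre_init(env_file: str, remote_url: str) -> None:
    # Step 2: upload the environment file before any ClearML task exists
    StorageManager.upload_file(local_file=env_file, remote_url=remote_url)


def start_run(config: dict, remote_prefix: str) -> None:
    # Step 3: upload the relevant CSV files and the configuration file
    for path in config.get("csv_files", []):  # hypothetical config key
        StorageManager.upload_file(local_file=path, remote_url=f"{remote_prefix}/{path}")


pre_init(".env", "s3://my-bucket/cache/.env")                 # placeholder paths
start_run({"csv_files": ["data/train.csv"]}, "s3://my-bucket/cache")

# Step 4: only now initialize ClearML and switch to remote execution
task = Task.init(project_name="internal tests", task_name="example run")
task.execute_remotely(queue_name="default")  # queue name is a placeholder
```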
BTW AgitatedDove14 following this discussion I ended up doing the regex way myself to sync these, so our code has something like the following. We abuse the object description here to store the desired file path.
```python
config_path = task.connect_configuration(configuration=config_path, name=config_fname)
included_files = find_included_files_in_source(config_path)
while included_files:
    file_to_include = included_files.pop()
    sub_config = task.connect_configuration(
        configuration=file_to_include,
        name=file_to_include,
        description=file_to_include,  # the description stores the desired file path
    )
```
StaleButterfly40 what use case are you looking for? I've used environment variables in the config file and then I can overwrite them in os.environ before ClearML loads the config
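Roughly what I mean, as a sketch (MY_S3_BUCKET is a made-up variable name; this assumes clearml.conf references it, e.g. via ${MY_S3_BUCKET}):

```python
import os

# Override the variable before ClearML loads its configuration (i.e. before Task.init)
os.environ["MY_S3_BUCKET"] = "s3://my-other-bucket"  # hypothetical value

from clearml import Task

task = Task.init(project_name="example", task_name="env-override demo")  # placeholder names
```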
You mean at the container level or at clearml?
Yes, the container level (when these docker shell scripts run).
The per user ID would be nice, except I upload the .env file before the Task is created (it's only available really early in the code).
@<1523701087100473344:profile|SuccessfulKoala55> could you provide some instructions?
We're using a self-hosted account
Thanks SuccessfulKoala55 and AgitatedDove14 ! We'll go through the hoops of setting up mongo on AWS then.
We're working to decouple the data from the helm chart; it seems like a dangerous idea to store long-term data on k8s in case of failure 😅
That's fine for the current use-case I believe.
Once the team is happy with the logging functionality, we'll move on to remote execution and things will update.
Generally, really. I've struggled recently (and in the past), because the documentation seems:
- Very complete wrt the available SDK (though the formatting is sometimes off)
- Very lacking wrt how things interact with one another

A lot of what I need I actually find by digging into the source code.
I think ClearML would benefit a lot if it adopted a documentation structure similar to the numpy ecosystem (numpy, pandas, scipy, scikit-image, scikit-bio, scikit-learn, etc.)
I created a new task with the project name "internal tests" and no task name (so it's derived by ClearML).
The task was a simple print out.
The project does not appear in the project space and does not turn up on searches (the task does)
Now I tried setting the pip version to <22.3 (both in the config and in the scaler's "extra config parameters"), but it still uses the latest?
added seed packages: pip==22.3.1, setuptools==65.5.1, wheel==0.38.4
One more UI question TimelyPenguin76, if I may -- it seems one cannot simply report single integers. The report_scalar feature creates a plot of a single data point (or single iteration).
For example, if I want to report a scalar "final MAE" for easier comparison, it's kinda impossible 😞
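For reference, this is roughly what I'm doing, as a sketch (the value and names are placeholders); it ends up as a single-point plot rather than a plain comparable number:

```python
from clearml import Task

task = Task.init(project_name="example", task_name="final-metric demo")  # placeholder names
logger = task.get_logger()

# Produces a plot with one data point instead of a simple number in the UI
logger.report_scalar(title="summary", series="final MAE", value=0.042, iteration=0)
```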
This could be relevant SuccessfulKoala55; it might point to some serious bug in ClearML multiprocessing too - https://stackoverflow.com/questions/45665991/multiprocessing-returns-too-many-open-files-but-using-with-as-fixes-it-wh
My current approach with pipelines basically looks like a GH CICD yaml config btw, so I give the user a lot of control on which steps to run, why, and how, and the default simply caches all results so as to minimize the number of reruns.
The user can then override and choose exactly what to do (or not do).
It could be related to the ClearML agent or server then. We temporarily upload a given .env file to an internal S3 bucket (cache), then switch to remote execution. When the remote execution starts, it first looks for this .env file, downloads it using StorageManager, loads it with dotenv, and then continues execution normally.
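On the remote side it is roughly the following sketch (the bucket URL is a placeholder for our internal cache; assumes python-dotenv is installed):

```python
from clearml import StorageManager
from dotenv import load_dotenv  # python-dotenv

# Placeholder for our internal S3 cache location
remote_env_url = "s3://internal-bucket/cache/.env"

# Download the cached .env file and populate the environment
local_env = StorageManager.get_local_copy(remote_url=remote_env_url)
load_dotenv(dotenv_path=local_env)
# ...then execution continues normally with the environment populated
```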