I can definitely feel you!
(I think the implementation is not trivial, metrics data size is collected and stored as commutative value on the account, going over per Task is actually quite taxing for the backend, maybe it should be an async request ? like get me a list of the X largest Tasks? How would the UI present it? As fyi, keeping some sort of book keeping per task is not trivial either, hence the main issue)
BoredHedgehog47 this is basically a wizard explaining the steps, see the 3 tabs 🙂
BTW, you can launch an experiment directly from CLI with clearml-task
https://clear.ml/docs/latest/docs/apps/clearml_task
I guess that was never the intention of the function, it just returns the internal representation. Actually my question would be, how do you use it, and why? :)
So just to be clear - the file server has nothing to do with the storage?
Think of it as a quick and dirty "minio", storing files and serving them over http. If you have minio (or any object storage) you can replace it all together 🙂
task.connect
is two way, it does everything for you:base_params = dict(param1=123, param2='text') task.connect(base_params) print(base_params)
If you run this code manually, then print is exactly what you initialized base_params
with. But when the agent is running it, it will take the values from the UI (including casting to the correct type), so print will result in values/types from the UI.
Make sense ?
Hi ShallowArcticwolf27
However, the AMI for version 0.16.1 has the following docker-compose file
I think we moved the docker-compose yaml when we upgraded from trains to clearml. Any reason your are installing the old docker-compose ?
HarebrainedBear62 this is what I have.
clearml-data will store all the files for you, and version the entire thing, make is a breeze to abstract the dataset from the code. Querying data is available using Apache Drill (though currently it is still not built into the platform, but we are planning to get there soon) Since this is Image based data/meta-data, I know the paid tier of ClearML, has n additional dedicated data management solution specifically for images, with full ability to query m...
Hi WickedGoat98
but is there also a way to delete them, or wipe complete projects?
https://github.com/allegroai/trains/issues/16
Auto cleanup service here:
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
I lost you SmallBluewhale13 is this the Task init call you used:task = Task.init( project_name="examples", task_name="load_artifacts", output_uri="s3://company-clearml/artifacts/bethan/sales_journeys/", )
task.project
is the project ID (not name)task.get_project_name()
will return the project name
Hi AgitatedTurtle16
My question is how to use it to manage my experiments (docker containers). Simply put, let's say:
So basically once you see an experiment in the UI, it means you can launch it on an agent.
There is No need to containerize your experiment (actually that's kind of the idea, removing the need to always containerize everything).
The agent will clone the code, apply uncommitted changes & install the packages in the base-container-image at runtime.
This allows you to u...
GiganticTurtle0
I'm assuming here that self.dask_client.map(read_and_process_file, filepaths)
actually does the multi process/node processing. The way it needs to work, it has to store the current state of the process and then restore it on any remote node/process. In practice this means pickling the local variables (Task included).
First I would try to use a standalone static function for the map, DASK might be able to deduce it does not need to pickle anything, as it is standalone.
A...
https://github.com/allegroai/clearml/issues/199
Seems already supported for a while now ...
Hi PompousBeetle71
I remember it was an issue, but it was solved a while ago. Which Trains version are you using?
too large to be stored in the .cache path? It will be stored there anyway?
oh that is exactly why the latest release supports chunks, so you can get a partial copy 🙂
nonetheless, the assumption is that you will have to end up with the data locally, otherwise the network becomes a huge bottleneck
make sense ?
MysteriousBee56
Well we don't want to ask sudo permission automatically, and usually setups do no change, but you can diffidently call this one before running the agent 😉sudo chmod 777 -R ~/.trains/
I think this is the issue, it was search and replaced . The thing is I'm not sure the helm chart is updated to clearml. Let me check
Hi JumpyDragonfly13 , just making sure, do you have an agent running on a remote machine ?
Can you have a direct TCP connection to the remote machine (the default port it will use is 10022)
BTW:
Task.add_requirements('tensorflow', '2.2') will make sure you get the specified version 🙂
Is this still an issue (if you provide queue name, the default tag is not used so no error should be printed)
If the problem consists (i.e. trains failing to detect packages, please open a GitHub Issue so the bug will not get lost 🙂
One thought is to initialise a new clearML task in each fold to capture the iteration-level metrics, and then create another task/experiment at the end to capture the aggregated metrics across folds.
That is probably the easiest, and the most scalable.
BTW: with the mew reporting feature, you can integrate the comparison of the CV directly into your final report 🙂
ohh, could it be a 32bit version of python ?
seems to run properly now
Are you saying the problem disappeared ?
in my repo I maintain a bash script to setup a separate python env.
Hmm interesting, now I have to wonder what is the difference ? meaning why doesn't the agent build a similar one based on the requirements ?
Once the team is happy with the logging functionality, we'll move on to remote execution and things will update.
🎉
While I do have the access and secret defined in clearml.conf, and even in the WebUI, I still get similar
and you have your credentials in the browser when deleting a Task ?
SubstantialElk6 I just realized 3 weeks passed, wow!
So the good news we have some new examples:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_functions.py
The bad news the documentation was postponed a bit, as we are still messaging the interface (the community is constantly pushing for great ideas and uses cases , and they are just too good to miss out 🙂 )...