
Did the shell script route work? I have a similar question.
It's a little more complicated because the index URL is not fixed; it contains a token that is only valid for a maximum of 12 hours. That means the ~/.config/pip/pip.conf
file also needs to be updated every 12 hours. Fortunately, that editing is done automatically when you log in to AWS CodeArtifact from the command line.
My current thinking is as follows:
Install the awscli
- pip install awscli
(c...
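The token-refresh idea above can be sketched roughly like this; the domain, repository and account values are hypothetical placeholders, and `aws codeartifact login --tool pip` is the documented command that rewrites pip.conf with a fresh token:

```python
# Sketch: build the CodeArtifact login command that refreshes the pip token.
# Domain, repository and account values are hypothetical placeholders.

def build_login_cmd(domain: str, repository: str, domain_owner: str) -> list:
    # 'aws codeartifact login --tool pip' rewrites ~/.config/pip/pip.conf
    # with a freshly minted authorization token (valid up to 12 hours)
    return [
        "aws", "codeartifact", "login",
        "--tool", "pip",
        "--domain", domain,
        "--domain-owner", domain_owner,
        "--repository", repository,
    ]

# Run this (e.g. from cron every ~10 hours) so the token never lapses:
# import subprocess
# subprocess.run(build_login_cmd("my-domain", "my-repo", "123456789012"), check=True)
```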
Oh great, thanks! Was trying to figure out how the method knows that the docker image ID belongs to ECR. Do you have any insight into that?
Just upgraded matplotlib, going to test now
Locally or on the remote server?
It just hangs when trying to upload. Maybe that is the reason the plots are not logging?
Yeah, it's not urgent. I will change the labels around to avoid this error 🙂 thanks for checking!
It's a seaborn heatmap that needs to be plotted. Not sure if that is useful at all.
So how do I ensure that artefacts are uploaded in the correct bucket from within clearml?
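For reference, one way this is typically done is via the `output_uri` parameter of `Task.init`, which directs model and artifact uploads to a given bucket; the project and prefix below are a hypothetical sketch:

```python
# Sketch: direct a task's artifact/model uploads to a specific S3 bucket
# by setting output_uri at task creation. Project/prefix names are hypothetical.

def artifact_destination(bucket: str, prefix: str) -> str:
    # Build the output_uri string passed to Task.init
    return "s3://{}/{}".format(bucket, prefix.strip("/"))

# With a configured clearml.conf holding S3 credentials:
# from clearml import Task
# task = Task.init(
#     project_name="pre-engine-traits",
#     task_name="train",
#     output_uri=artifact_destination("15gifts-clearml", "artefacts"),
# )
```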
SuccessfulKoala55 thanks for your help as always. I will try to create a DAG on airflow using the SDK to implement some form of retention policy which removes things that are not necessary. We independently store metadata on artefacts we produce, and mostly use clearml as the experiment manager, so a lot of the events data can be cleared.
using this method training_task.set_model_label_enumeration(label_map)
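For context, a minimal sketch of that call; the label map contents here are hypothetical, and the sanity-check helper is just an illustration:

```python
# Sketch: attach a label enumeration (class name -> integer id) to the
# task's output model. The mapping below is a hypothetical example.
label_map = {"no_sale": 0, "sale": 1}

def check_label_map(mapping: dict) -> dict:
    # ids must be unique integers, one per class name
    ids = list(mapping.values())
    assert all(isinstance(i, int) for i in ids)
    assert len(set(ids)) == len(ids)
    return mapping

# With a live task (requires a ClearML server connection):
# from clearml import Task
# training_task = Task.init(project_name="demo", task_name="train")
# training_task.set_model_label_enumeration(check_label_map(label_map))
```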
Ok, that explains a lot. The new user was using version 1.x.x and I was using version 0.17.x. I think that is why my task was being drafted and his was being aborted.
There is no specific use case for draft mode - it was just the mode I understood to be used for enqueuing a newly created task, but I assume aborted now has the same functionality.
2021-03-01 20:51:55,655 - clearml.Task - INFO - Completed model upload to s3://15gifts-clearml/artefacts/pre-engine-traits/logistic-regression-paths-and-sales-tfidf-device-brand.8d68e9a649824affb9a9edf7bfbe157d/models/tfidf-logistic-regression-1614631915-8d68e9a649824affb9a9edf7bfbe157d.pkl
2021-03-01 20:52:01
2021-03-01 20:51:57,207 - clearml.Task - INFO - Waiting to finish uploads
Thanks maestro. Will give this a go
Where are you storing your secret JitteryCoyote63 ?
While we're here, how can I return the model accuracy (or any performance metric for that matter) given a model(s) belonging to a particular task? Is this information stored anywhere or do I need to explicitly log this data somehow?
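ClearML does keep reported scalars with the task, so one way to pull them back is `Task.get_last_scalar_metrics()`; a sketch, where the metric/series names ("accuracy", "validation") and the task id are hypothetical:

```python
# Sketch: retrieve the last reported scalar values for a task.
# Task.get_last_scalar_metrics() returns a nested dict of the form:
#   {metric_title: {series_name: {"last": ..., "min": ..., "max": ...}}}

def last_value(scalars: dict, title: str, series: str):
    # Pull a single "last" value out of the nested structure (None if absent)
    return scalars.get(title, {}).get(series, {}).get("last")

# With a live task (requires a ClearML server connection):
# from clearml import Task
# task = Task.get_task(task_id="<your-task-id>")
# acc = last_value(task.get_last_scalar_metrics(), "accuracy", "validation")
```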
How we resolved this issue is by developing a package to deal with the connecting and querying to databases. This package is then used inside the task for pulling the data from the data warehouse. There is a devops component here for authorising access to the relevant secret (we used SecretsManager on AWS). The clearml-agent instances are launched with role permissions which allow access to the relevant secrets. Hope that is helpful to you
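Roughly the pattern described above, sketched with hypothetical secret ids and JSON keys (fetching requires boto3 and an agent role with access to the secret):

```python
# Sketch: pull DB credentials from AWS Secrets Manager inside the task.
# The secret id and the JSON keys are hypothetical placeholders.
import json

def parse_secret(secret_string: str) -> tuple:
    # Secrets Manager returns the secret payload as a JSON string
    creds = json.loads(secret_string)
    return creds["username"], creds["password"]

# On a clearml-agent instance launched with the right role permissions:
# import boto3
# client = boto3.client("secretsmanager")
# resp = client.get_secret_value(SecretId="prod/dwh/readonly")
# user, password = parse_secret(resp["SecretString"])
```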
Yes it does 🙂 I suspected this was the process. Thanks Jake. One last question, more so about the architecture design - is it advised to have the clearml server instance and a 'worker' instance listening to the queue as separate remote machines, or can I use the same instance for the web UI and and as a worker? I understand that processing pipelines may be compute intense enough to consume all resources and break the web UI, but I was wondering whether using a single large instance is a po...
And how will it know that the container is on ECR instead of some other container repository?
Any news on this bug?
Our model store consists of metadata stored in the DWH, and model artifacts stored in S3. We technically use ClearML for managing the hardware resource for running experiments, but have our own custom logging of metrics etc. Just wondering how tricky integrating a trigger would be for that
The task is dependent on a few artefacts from another task. Is there anything else I can do here?
On my local I have clearml 0.17.4
Yes it does, but that requires me to manually create a new agent every time I want to run a different env, no?
We are planning to use Airflow as an extension of clearml itself, for several tasks:
We want to isolate the data validation steps from the general training pipeline; the validation will be handled using some base logic and some more advanced validations using something like Great Expectations. Our training data will be a snapshot from the most recent 2 weeks, and this training data will be used across multiple tasks. To automate the scheduling and execution of training pipelines periodically e...
@<1687643893996195840:profile|RoundCat60> Hey Alex. Could you take a look at this when you're free later on please
Haha no not that much, I was just trying to play around with removing tasks etc, and didn't want to remove tasks created by co-workers.
Out of interest, is there a reason these are read-only? The code for these tasks is on github right?
Here is the error message from the console:
Collecting git+ssh://****@github.com/15gifts/py-db.git
  Cloning ssh://****@github.com/15gifts/py-db.git to /tmp/pip-req-build-xai2xts_
  Running command git clone -q 'ssh://****@github.com/15gifts/py-db.git' /tmp/pip-req-build-xai2xts_
ERROR: Repository not found.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights and the repository exists.