
Thanks GrumpyPenguin23, will have a look shortly 🙂
Nope, from a remote server. The issue was that I had installed the package from git locally, so when pushing the task, clearml assumed it should also install from git. I've since installed the package from the private PyPI and it all works as expected now 🙂
Thanks Martin. I think I have found where the error is!
I can't figure out from the examples how the external trigger works. All of our model performance stats are in the DWH, and we want to build triggers based on that. Is it possible to integrate that with ClearML triggers and schedulers?
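For what it's worth, here is a minimal sketch of one way this could work, assuming a polling approach rather than a built-in external trigger - the DWH helper, project/task names, and queue name are all hypothetical:

```python
import time
from clearml import Task

def dwh_accuracy_dropped() -> bool:
    """Hypothetical helper - replace with a real DWH query,
    e.g. compare the latest accuracy against a threshold."""
    return False

while True:
    if dwh_accuracy_dropped():
        # Clone a template training task and push it to an agent queue
        template = Task.get_task(project_name="pre-engine-traits",
                                 task_name="retrain-template")
        cloned = Task.clone(source_task=template)
        Task.enqueue(cloned, queue_name="default")
    time.sleep(15 * 60)  # poll the DWH every 15 minutes
```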
Hey Martin. We have managed to resolve this. FYI, the issue was with the resolving of the host: it had to be changed from `@github.com` to whatever the host is called in the SSH config file! In other words, if `~/.ssh/config` defines the key under an alias, the package URL has to reference that alias rather than `github.com` directly.
```
2021-03-01 20:51:55,655 - clearml.Task - INFO - Completed model upload to s3://15gifts-clearml/artefacts/pre-engine-traits/logistic-regression-paths-and-sales-tfidf-device-brand.8d68e9a649824affb9a9edf7bfbe157d/models/tfidf-logistic-regression-1614631915-8d68e9a649824affb9a9edf7bfbe157d.pkl
```
```
2021-03-01 20:51:57,207 - clearml.Task - INFO - Waiting to finish uploads
```
I don't think it's that - it's a 20 KB file upload. This was the last message just printed:
```
ClearML Monitor: Could not detect iteration reporting, falling back to iterations as seconds-from-start
```
The task is dependent on a few artefacts from another task. Is there anything else I can do here?
```python
import seaborn as sns
import matplotlib.pyplot as plt

# Plot the confusion matrix for predictions
sns.heatmap(
    preds_confusion_percentage, annot=True, fmt=".3f", linewidths=.5,
    square=True, cmap='Blues_r'
)
plt.ylabel('Actual label')
plt.xlabel('Predicted label')
title_str = f'Accuracy Score: {round(score, 2)}\n{TRANSFORM_TYPE}'
plt.title(title_str, size=15)
task.logger.report_matplotlib_figure(
    title=f"Performance Heatmap - {model_export_name}",
    series="Device Brand Predictions",
    iteration=0,
    figure=plt,  # the original was truncated here; passing the pyplot module is accepted
)
```
While we are here - excuse my ignorance if this has already been stated in the docs...
Is it possible to launch multiple clearml-agents on a dedicated clearml-agent server? I noticed that with one agent, only one task gets executed at a time.
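For reference, it's usually possible to run several agent daemons on one machine, each listening to a queue - e.g. running `clearml-agent daemon --queue default --detached` more than once, or pinning each daemon to a GPU with `--gpus 0` / `--gpus 1`. Each daemon still executes one task at a time.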
We resolved this issue by developing a package that deals with connecting to and querying databases. This package is then used inside the task for pulling the data from the data warehouse. There is a DevOps component here for authorising access to the relevant secret (we used Secrets Manager on AWS). The clearml-agent instances are launched with role permissions which allow access to the relevant secrets. Hope that is helpful to you.
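A minimal sketch of that pattern - the secret name and region are hypothetical, and this assumes the agent instance's IAM role already grants `secretsmanager:GetSecretValue`:

```python
import json
import boto3

def get_db_credentials(secret_name: str = "prod/dwh/readonly") -> dict:
    """Fetch DB credentials from AWS Secrets Manager.

    Runs inside the task; no credentials are stored in code or in the
    ClearML config - the agent's IAM role authorises the call.
    """
    client = boto3.client("secretsmanager", region_name="eu-west-1")
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])
```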
Hey Martin. By labels map, I'm referring to the labels map assigned to the model. The one you can view in the models tab // labels
So how do I ensure that artefacts are uploaded to the correct bucket from within clearml?
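One way to pin this down is to set `output_uri` when initialising the task, so artefacts and models are uploaded there rather than to the default files server. A minimal sketch, with the bucket/project names borrowed from the log above:

```python
from clearml import Task

task = Task.init(
    project_name="pre-engine-traits",
    task_name="tfidf-logistic-regression",
    # All artefacts/models for this task are uploaded to this bucket
    output_uri="s3://15gifts-clearml/artefacts",
)
```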
Here is the error message from the consoleCollecting git+ssh://****@github.com/15gifts/py-db.git Cloning ssh://****@github.com/15gifts/py-db.git to /tmp/pip-req-build-xai2xts_ Running command git clone -q 'ssh://****@github.com/15gifts/py-db.git' /tmp/pip-req-build-xai2xts_ ERROR: Repository not found. fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists.
using this method: `training_task.set_model_label_enumeration(label_map)`
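For reference, a minimal sketch of what that call expects - a dict mapping label names to integer values (the names here are made up):

```python
# Hypothetical label enumeration for the model
label_map = {"unknown": 0, "device_brand_a": 1, "device_brand_b": 2}
training_task.set_model_label_enumeration(label_map)
```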
Yeah, it's not urgent. I will change the labels around to avoid this error 🙂 thanks for checking!
Did the shell script route work? I have a similar question.
It's a little more complicated because the index URL is not fixed; it contains the token, which is only valid for a maximum of 12 hours. That means the `~/.config/pip/pip.conf` file will also need to be updated every 12 hours. Fortunately, this editing is done automatically by authenticating to AWS CodeArtifact on the command line.
My current thinking is as follows:
- Install the awscli: `pip install awscli`
- (c...
One question - you can also set `agent.package_manager.extra_index_url`, but since this is dynamic, will pip install still pick up the extra index URL from the pip config file? Or does it have to be set in this agent config variable?
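For the token refresh itself, here is a sketch of generating a fresh index URL with boto3 - the domain, owner account, region and repo are all hypothetical, and the URL format is the standard CodeArtifact pip endpoint:

```python
import boto3

def codeartifact_index_url(domain: str = "15gifts",
                           owner: str = "123456789012",
                           region: str = "eu-west-1",
                           repo: str = "py-db") -> str:
    """Build a pip extra-index-url containing a fresh CodeArtifact token.

    Tokens expire after at most 12 hours, so this needs to run on a
    schedule and rewrite pip.conf (or the agent config) each time.
    """
    client = boto3.client("codeartifact", region_name=region)
    token = client.get_authorization_token(
        domain=domain, domainOwner=owner
    )["authorizationToken"]
    return (
        f"https://aws:{token}@{domain}-{owner}.d.codeartifact."
        f"{region}.amazonaws.com/pypi/{repo}/simple/"
    )
```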
SuccessfulKoala55 thanks for your help as always. I will try to create a DAG on Airflow using the SDK to implement some form of retention policy that removes things which are no longer necessary. We independently store metadata on the artefacts we produce, and mostly use clearml as the experiment manager, so a lot of the events data can be cleared.
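A rough sketch of the kind of cleanup step such a DAG could run - the project name and 30-day cutoff are hypothetical, and the `Task.get_tasks` filter and `task.delete()` behaviour are worth verifying against your server/SDK version:

```python
from datetime import datetime, timedelta
from clearml import Task

cutoff = datetime.utcnow() - timedelta(days=30)

# Fetch completed tasks in the project and drop the stale ones;
# the metadata we need is stored independently in the DWH anyway.
for task in Task.get_tasks(project_name="pre-engine-traits",
                           task_filter={"status": ["completed"]}):
    last_update = task.data.last_update
    if last_update and last_update.replace(tzinfo=None) < cutoff:
        task.delete(delete_artifacts_and_models=False)
```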
Oh great, thanks! Was trying to figure out how the method knows that the docker image ID belongs to ECR. Do you have any insight into that?
I can authorise with CodeArtifact if I SSH into the server, and install the private package with no issues. It seems like something is forcing clearml-agent to install via git cloning rather than directly via pip. Not sure if this is a configuration I have set up myself, or whether the server is configured to do this.
In particular, I am trying to find a neat way to query all available models and use tags to know the context. As it stands, I log the model accuracies/RMSEs as part of the metadata, alongside the training data filepath. The issue is that this isn't the neatest way of querying models across tasks - it involves a lot of laborious manual lifting. Suggestions welcome.
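One possibility, assuming a reasonably recent clearml version (the tag names here are invented), is to tag models at training time and pull them back with `Model.query_models`:

```python
from clearml import Model

# Find all published models carrying the relevant context tags
models = Model.query_models(
    project_name="pre-engine-traits",
    tags=["device-brand", "production-candidate"],
    only_published=True,
)
for model in models:
    print(model.id, model.name, model.tags)
```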
Could it be how I am trying to log the figure manually?
And how will it know that the container is on ECR instead of some other container repository?
Yes it does, but that requires me to manually create a new agent every time I want to run a different env, no?
Our model store consists of metadata stored in the DWH and model artifacts stored in S3. We technically use ClearML for managing the hardware resources for running experiments, but have our own custom logging of metrics etc. Just wondering how tricky integrating a trigger would be with that setup.