Oh, that may work. Are there any docs/demos on this?
That is a neat way of making it work! Thanks Martin. Once I've added the SSH key to the deployment keys in that repo, the change in the config should work, right? I'm guessing the extra index URL can be a URL to the GitHub repo of interest? (not another privately hosted PyPI repo)
Hey Martin. We have managed to resolve this. FYI the issue was with resolving the host. It had to be changed from @github.com to whatever the host is called in the SSH config file!
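For anyone hitting the same thing, here is a minimal sketch of the host swap described above. The helper name `use_ssh_alias` and the alias `github-deploy` are made up for illustration; the real alias is whatever `Host` entry is defined in the SSH config file.

```python
import re


def use_ssh_alias(url: str, alias: str, host: str = "github.com") -> str:
    """Swap the host of an SSH git URL for an alias from the SSH config.

    Covers scp-style URLs (git@github.com:org/repo.git) and
    ssh:// URLs (ssh://git@github.com/org/repo.git).
    """
    # Only match "@<host>" when it is followed by ":" or "/", so that
    # longer hostnames that merely start with <host> are left alone.
    pattern = "@" + re.escape(host) + r"(?=[:/])"
    return re.sub(pattern, lambda _match: "@" + alias, url)


# "github-deploy" is a hypothetical alias, not one taken from this thread.
print(use_ssh_alias("git@github.com:my-org/my-repo.git", "github-deploy"))
print(use_ssh_alias("ssh://git@github.com/my-org/my-repo.git", "github-deploy"))
```

HTTPS URLs contain no `@github.com` part, so they pass through unchanged.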
While we are here - excuse my ignorance if this has already been stated in the docs ..
Is it possible to launch multiple clearml-agents on a dedicated clearml-agent server? I noticed that with one agent, only one task gets executed at a time
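A rough sketch of what launching several agents on one machine could look like, assuming each `clearml-agent daemon` process serves one task at a time and that several daemons can listen on the same queue. The helper names are hypothetical, the queue name `default` is only an example, and `--cpu-only` assumes a machine without GPUs.

```python
import shlex
import subprocess
from typing import List


def build_agent_commands(queue: str, count: int) -> List[List[str]]:
    """Build one `clearml-agent daemon` command per desired worker."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [
        # --detached runs the daemon in the background;
        # --cpu-only is an assumption for a GPU-less machine.
        ["clearml-agent", "daemon", "--queue", queue, "--cpu-only", "--detached"]
        for _ in range(count)
    ]


def launch_agents(queue: str, count: int) -> None:
    """Start the daemons one after another (requires clearml-agent on PATH)."""
    for cmd in build_agent_commands(queue, count):
        subprocess.run(cmd, check=True)


# Print the commands instead of running them, so they can be inspected first.
for cmd in build_agent_commands("default", 2):
    print(shlex.join(cmd))
```

Calling `launch_agents("default", 2)` would then start two daemons on the same queue, so two tasks can run side by side.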
Ideally, I want to avoid reinventing the wheel, so if this functionality already exists with some examples, it would be great if someone could point me to it
Are the envs named after the worker enumeration? e.g. venv-builds-0 is linked to worker 0?
To report the metric to ClearML, would that just be a batch update every interval t?
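To make the question concrete, this is roughly what a batch update every t seconds might look like. It is only a sketch: `BatchedScalarReporter` is a made-up helper, not part of ClearML, and it assumes the `report` callable accepts `title`, `series`, `value` and `iteration` keyword arguments the way `Logger.report_scalar` does.

```python
import time
from typing import Callable, List, Tuple


class BatchedScalarReporter:
    """Buffer scalar values and forward them once per time interval."""

    def __init__(self, report: Callable[..., None], interval_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._report = report          # e.g. task.get_logger().report_scalar
        self._interval = interval_sec  # the "t" from the question
        self._clock = clock            # injectable so it can be tested
        self._buffer: List[Tuple[str, str, float, int]] = []
        self._last_flush = clock()

    def add(self, title: str, series: str, value: float, iteration: int) -> None:
        """Queue one value; flush if the interval has elapsed."""
        self._buffer.append((title, series, value, iteration))
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Send everything buffered so far and restart the interval."""
        for title, series, value, iteration in self._buffer:
            self._report(title=title, series=series, value=value, iteration=iteration)
        self._buffer.clear()
        self._last_flush = self._clock()
```

With ClearML this would be wired up as `BatchedScalarReporter(task.get_logger().report_scalar, interval_sec=30)`, with a final `flush()` before the task ends so the last partial batch is not lost.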
Yes it does 🙂 I suspected this was the process. Thanks Jake. One last question, more about the architecture design: is it advised to have the clearml server instance and a 'worker' instance listening to the queue as separate remote machines, or can I use the same instance for the web UI and as a worker? I understand that processing pipelines may be compute-intensive enough to consume all resources and break the web UI, but I was wondering whether using a single large instance is a po...
Awesome, thank you Jake! Very helpful. For a lot of the models we run, we do not require GPU resources, so it's good to know that a beefy instance should be able to run the experiments.
Thanks AnxiousSeal95, will check it out! 🙂
Locally or on the remote server?
` # Plot the confusion matrix for predictions
sns.heatmap(
    preds_confusion_percentage, annot=True, fmt=".3f", linewidths=.5,
    square=True, cmap='Blues_r'
)
plt.ylabel('Actual label')
plt.xlabel('Predicted label')
title_str = f'Accuracy Score: {round(score, 2)}\n{TRANSFORM_TYPE}'
plt.title(title_str, size=15)
task.logger.report_matplotlib_figure(
    title=f"Performance Heatmap - {model_export_name}",
    series="Device Brand Predictions",
    iteration=0,
    figure=pl...
/home/ubuntu/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/clearml/utilities/plotlympl/mpltools.py:371: MatplotlibDeprecationWarning: The is_frame_like function was deprecated in Matplotlib 3.1 and will be removed in 3.3.
This is the last print statement before it hangs
I don't think it's that. It's a 20 KB file upload. This was the last message printed: ClearML Monitor: Could not detect iteration reporting, falling back to iterations as seconds-from-start
Just upgraded matplotlib, going to test now
So do I set the report_image to False for the plots to appear in the plots tab?
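A small sketch of the toggle being asked about. As far as I understand it, `report_image=False` sends the figure to the Plots tab, while `report_image=True` uploads it as an image to the debug samples instead. The wrapper `report_figure` and its `as_image` parameter are hypothetical names; only `report_matplotlib_figure` and its `report_image` argument come from ClearML.

```python
from typing import Any


def report_figure(logger: Any, figure: Any, title: str, series: str,
                  iteration: int = 0, as_image: bool = False) -> None:
    """Report a matplotlib figure through a ClearML-style logger.

    as_image=False: assumed to land in the Plots tab.
    as_image=True:  assumed to land in the debug samples as a static image.
    """
    logger.report_matplotlib_figure(
        title=title,
        series=series,
        iteration=iteration,
        figure=figure,
        report_image=as_image,
    )


# Usage with a real task (not executed here):
#   report_figure(task.get_logger(), plt.gcf(),
#                 title="Performance Heatmap", series="Device Brand Predictions")
```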
If uploaded as image, what is the target destination for logging?
I will try adding some print statements to test the hanging issue
Another update - the tasks run fine and install the packages from the correct index URL. However, by default, py_db @ git .. is added in the installed packages panel. Could this be from a requirements.txt file somewhere? To get it to work, I have to remove the @ git part, and then it works. Just very strange that it defaults to a git pip install 🤔
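The manual workaround above (removing the `@ git` part) could be sketched like this for a list of requirement lines. `strip_git_references` is a made-up helper, and the URL in the example is a placeholder, not the real repository address.

```python
import re
from typing import List

# Matches "<package> @ git+<url>" style direct references.
_DIRECT_REF = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*@\s*git\+\S+.*$")


def strip_git_references(requirements: List[str]) -> List[str]:
    """Reduce 'name @ git+...' lines to the bare package name.

    All other lines are returned unchanged, so pinned versions survive.
    """
    cleaned = []
    for line in requirements:
        match = _DIRECT_REF.match(line)
        cleaned.append(match.group(1) if match else line)
    return cleaned


print(strip_git_references([
    "py_db @ git+ssh://example.invalid/org/py-db.git",  # placeholder URL
    "numpy==1.21.4",
]))
# prints ['py_db', 'numpy==1.21.4']
```

With the bare name left in place, pip resolves the package from whichever index is configured, rather than cloning the repository.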
The reason I am asking is that we have servers with large RAM capacity but minimal storage capacity, meaning that objects held in memory can sometimes surpass storage capacity if export is required
Using SSH credentials - replacing https url '
' with ssh url '
' Replacing original pip vcs 'git+
' with '
` '
Collecting py_db
Cloning ssh://@github.com/15gifts/py-db.git (to revision 851daa87317e73b4602bc1bddeca7ff16e1ac865) to /tmp/pip-install-zpiar1hv/py-db
Running command git clone -q 'ssh://@github.com/15gifts/py-db.git' /tmp/pip-install-zpiar1hv/py-db
2021-12-08 15:56:31
ERROR: Repository not found.
fatal: Could not read from remote repository.
Please...
Nope, from a remote server. The cause was that I had installed the package from git locally, so when pushing the task, ClearML assumed it should also install from git. I have since installed the package from the private PyPI and it all works as expected now 🙂
Okay, solved the problem. It is using the version that is locally installed (on my laptop). Is there a way to prevent this? Perhaps a requirements.txt or something like that?
I don't think we explicitly pass the package path to the agent. I expect it to run a regular pip install but it seems to be doing it via git somehow
This is included as part of the config file at ~/clearml.conf on the clearml-agent:
extra_docker_shell_script: [ "apt-get install -y awscli", "aws codeartifact login --tool pip --repository data-live --domain ds-15gifts-code", ]
Not sure how to get a log from the CLI but I can get the error from the clearml server UI, one sec
I think there is more complexity to what I am trying to achieve, but this will be a good start. Thanks!
I will need to log data set ID, transformer (not the NN architecture, just a data transformer), the model (with all hyperparameters & metadata) etc. and how all things link
A new user is trying to push tasks, and each task instantly changes from running to aborted
This is a suspicion only. It could be something else. In my case, there is no artifact or other config with a dict containing that key. Only the label map contains that key