ReassuredTiger98
It seems like clearml is not able to fetch the dependencies correctly when importlib is used.
If you have an example, please let me know and we'll try to fix it :)
Is it possible to read the dependencies manually from a conda environment.yml?
You can set detect_with_conda_freeze: true in clearml.conf, it will just use the entire conda env
https://github.com/allegroai/clearml/blob/28b85028fe4da3ab963b69e8ac0f7feef73cfcf6/docs/clearml.conf#L170
yes i can communicate with the server, i managed to put tasks in the queue and retrieve them as well as running tasks with metrics reporting
Through the UI or python code ?
SmarmySeaurchin8 yes, the package containing the Controller is only an RC; the plan is to release the stable one in a couple of days. In the meantime: pip install git+
Hmm apparently it is not passed, but it could be.
Would the object itself be enough to get the values? Wouldn't it make sense to get them from outside somehow? (I'm assuming there is one set of args used at any certain moment?)
Hi HollowDolphin18
Sure, just use:
Task.set_credentials(api_host=None, web_host=None, files_host=None, key=None, secret=None, store_conf_file=False)
https://github.com/allegroai/clearml/blob/912f6f5ba2328b26de042de03f02de5802df360f/clearml/task.py#L2153
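As a minimal sketch of how that call might be wired up: the MY_CLEARML_* environment variable names below are hypothetical and not something from this thread; only Task.set_credentials and its parameters come from the message above. As far as I know it should be called before Task.init().

```python
import os


def credentials_from_env(env=None):
    """Collect Task.set_credentials keyword arguments from environment variables.

    The MY_CLEARML_* names are hypothetical placeholders; use whatever your
    deployment actually provides.
    """
    env = os.environ if env is None else env
    return {
        "api_host": env.get("MY_CLEARML_API_HOST"),
        "web_host": env.get("MY_CLEARML_WEB_HOST"),
        "files_host": env.get("MY_CLEARML_FILES_HOST"),
        "key": env.get("MY_CLEARML_KEY"),
        "secret": env.get("MY_CLEARML_SECRET"),
    }


def apply_credentials(env=None):
    # Imported lazily so the helper above works without clearml installed.
    from clearml import Task

    # store_conf_file=False means no configuration file is written.
    Task.set_credentials(store_conf_file=False, **credentials_from_env(env))
```

Calling apply_credentials() at the top of a script, before the task is created, lets each user supply their own key and secret without touching clearml.conf.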
Could it be the Args section of the task it clones does not have the "input_train_data" argument ?
Setting the credentials on the agent machine means the users cannot use their own credentials, since a k8s glue agent serves multiple users.
Correct, I think the "vault" option is only available on the paid tier :)
but how should we do this for the credentials?
I'm not sure how to pass them; wouldn't it make sense to give the agent all-access credentials?
FriendlySquid61 could you help?
SmarmySeaurchin8 just so that I don't miss anything.
One machine, two trains-agents each one connected to a different trains-server, correct ?
From trains-agent --help:
trains-agent --config-file /home/user/my_trains_server1.conf daemon
trains-agent --config-file /home/user/my_trains_server2.conf daemon
I am trying to see if the user can submit a list of resource requirements (e.g. 4 GPUs, 12 cores, 100 GB disk space)
This will be quite easy to implement using the clearml k8s glue: just use user-properties and change the template based on them. I can point to where you need to modify the code
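To make the idea concrete, here is a rough sketch of changing a pod template based on user properties. It is not the glue's actual code: the property names (gpus, cpu_cores, disk) and the helper are hypothetical, and the template is assumed to be a plain dict shaped like a Kubernetes pod spec.

```python
import copy

# Hypothetical mapping from user-property names to Kubernetes resource keys.
PROPERTY_TO_RESOURCE = {
    "gpus": "nvidia.com/gpu",
    "cpu_cores": "cpu",
    "disk": "ephemeral-storage",
}


def apply_resource_request(pod_template, user_properties):
    """Return a copy of pod_template with limits set from user_properties."""
    template = copy.deepcopy(pod_template)
    # Assumes the task runs in the first container of the pod.
    container = template["spec"]["containers"][0]
    limits = container.setdefault("resources", {}).setdefault("limits", {})
    for prop, resource_key in PROPERTY_TO_RESOURCE.items():
        if prop in user_properties:
            limits[resource_key] = str(user_properties[prop])
    return template


base_template = {"spec": {"containers": [{"name": "task"}]}}
requested = {"gpus": 4, "cpu_cores": 12, "disk": "100Gi"}
patched = apply_resource_request(base_template, requested)
```

Working on a deep copy keeps the base template reusable for the next task that the glue picks up.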
and the inet of the same card ?
Hi DiminutiveToad80
I think we will need more context for the log...
but I think there is something wrong with the GCP resource configuration of your autoscaler
Can you send the full autoscaler log and the configuration ?
GrievingTurkey78 Actually it is in progress, see the GitHub issue for details:
https://github.com/allegroai/trains/issues/219
HugeArcticwolf77 changing the color is definitely a feature we will have in the next version; right now I think you cannot :) It is randomly chosen based on the title/series, and I think your example is a great failure case of that randomness :)
So I had to add it explicitly via a docker init script
Oh yes, that makes sense, can't think of a better hack other than sys.path.append(os.path.join(os.path.dirname(__file__), "src"))
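A slightly more defensive version of that hack might look like the sketch below. The helper name is made up for illustration, and "src" is simply the folder name from the message above.

```python
import os
import sys


def add_src_to_path(base_dir, subdir="src"):
    """Append base_dir/subdir to sys.path (only once) and return the path."""
    src_path = os.path.join(os.path.abspath(base_dir), subdir)
    if src_path not in sys.path:
        sys.path.append(src_path)
    return src_path


# In a script you would typically call it with the script's own folder:
#   add_src_to_path(os.path.dirname(os.path.abspath(__file__)))
```

Checking for an existing entry avoids growing sys.path when the module is imported more than once.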
Ohh then we can definitely support it, could you maybe post a toy example for testing? Or even better PR it to the examples/tensorboardX folder?
Thanks! I think I was able to locate the issue, but I wanted to verify :)
Oh that's definitely off :)
Can you send a quick toy snippet to reproduce it ?
Hi PungentLouse55
Are you referring to the example code ?
LovelyHamster1 verified, this is a UI bug: an old limitation is still being enforced.
I will make sure they know about it; it should be fixed for the upcoming release :)
Hi TenderCoyote78
I'm trying to clearml-agent in my dockerfile,
I'm not sure I'm following, are you trying to create a docker container containing the agent inside? For what purpose?
(notice that the agent can spin up any off-the-shelf container; there is no need to add the agent into the container, it will take care of itself when it is running it)
Specifically to your docker file:
RUN curl -sSL
| sh
No need for this line
COPY clearml.conf ~/clearml.conf
Try the ab...
Q. Would someone mind outlining what the steps are to configuring the default storage locations, such that any artefacts or data which are pushed to the server are stored by default on the Azure Blob Store?
Hi VivaciousPenguin66
See my reply here on configuring the default output uri on the agent: https://clearml.slack.com/archives/CTK20V944/p1621603564139700?thread_ts=1621600028.135500&cid=CTK20V944
Regarding permission setup:
You need to make sure you have the Azure blob credenti...
GreasyPenguin14 the demo-server is soon to be deprecated, so we are slow on upgrades there. But you can already see it in the SaaS free tier.
https://app.community.clear.ml/
To auto upload the model you have to tell clearml to upload it somewhere, usually by passing output_uri to Task.init or setting the default_output_uri in the clearml.conf
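For example, passing output_uri to Task.init could look roughly like this. The project name, task name and bucket URL are placeholders, not values from this thread.

```python
def task_init_kwargs(output_uri, project_name="examples", task_name="auto upload model"):
    """Build keyword arguments for Task.init; the defaults are placeholders."""
    return {
        "project_name": project_name,
        "task_name": task_name,
        # Destination the framework-saved models should be uploaded to.
        "output_uri": output_uri,
    }


def start_task(output_uri):
    # Imported lazily so the helper above works without clearml installed.
    from clearml import Task

    return Task.init(**task_init_kwargs(output_uri))


# Hypothetical usage, with a made-up bucket:
#   task = start_task("s3://my-bucket/models")
```

If you prefer not to touch the code, the same destination can instead be set once as default_output_uri in clearml.conf, as mentioned above.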
Good, so we narrowed it down. Now the question is how come it is empty ?
where the ui merges the plots just as we want and I was wondering if there is some simple way to do it in the case of all plots.
we can do it for scalars (this is trivial)
We can merge specific plots when they are simple, I think basic histograms.
But for any generic plots we fear the merge will just fail, and this is why it defaults to side by side.
how can I combine two plots in the ui as you mentioned?
The easiest solution is to use, "report_scatter2d", these are specific pl...
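A small sketch of the report_scatter2d approach, assuming that series reported under the same title end up on the same plot. The data, titles and helper names are invented for illustration.

```python
def make_series(n_points, slope):
    """Return [[x, y], ...] pairs for a straight line; illustrative data only."""
    return [[x, slope * x] for x in range(n_points)]


def report_combined(logger, title="combined plot"):
    # Both series share one title, which is what should group them together.
    for name, slope in (("series A", 1), ("series B", 2)):
        logger.report_scatter2d(
            title=title,
            series=name,
            iteration=0,
            scatter=make_series(5, slope),
            xaxis="x",
            yaxis="y",
            mode="lines+markers",
        )


# Hypothetical usage inside a running task:
#   from clearml import Task
#   report_combined(Task.current_task().get_logger())
```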
Do you know how I can make sure I do not have CUDA or a broken installation installed?
I don't think this is the case, it is quite specifically installing the CPU version.
BTW: after the agent fails it will not remove the venv, so you can get into it and check, from the log it will be in: /home/tim/.clearml/venvs-builds/3.7
are you planning on changing to f-strings incrementally?
There is still py 2.7 & 3.5 support...
Hopefully we will be able to drop both soon (apparently enough users still have legacy code); then we will probably switch to the nicer f-strings :)
I think the easiest way is to add another glue instance and connect it with CPU pods and the services queue. I have to admit that it has been a while since I looked at the chart but there should be a way to do that