FYI matplotlib imshow will create a debug image, and on complex plots the plot might get converted to an image (but shown under the Plots section). All in all you might not be aware of it, but you are uploading an image to your files server
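A minimal sketch of what triggers it (project/task names here are made up):
```python
import numpy as np
import matplotlib.pyplot as plt
from clearml import Task

# with the automagic matplotlib binding active, the figure shown here is
# captured and uploaded to the files server as a debug image (complex
# figures may end up under the Plots section instead)
task = Task.init(project_name="examples", task_name="imshow capture")
plt.imshow(np.random.rand(32, 32))
plt.show()
```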
What is the specific use case, updating a file on an existing dataset and creating a new version?
ValueError: Missing key and secret for S3 storage access
Yes that makes sense, I think we should make sure we do not suppress this warning it is too important.
Bottom line: you are missing a configuration section in your clearml.conf
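For reference, the S3 credentials usually live under the sdk.aws.s3 section of clearml.conf, roughly along these lines (values are placeholders):
```
sdk {
    aws {
        s3 {
            # placeholder credentials - replace with your own
            key: "ACCESS_KEY"
            secret: "SECRET_KEY"
            region: "us-east-1"
        }
    }
}
```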
set the following:
CLEARML_AGENT_DISABLE_SSH_MOUNT=1 clearml-agent daemon ...
The issue is, it will automatically mount the host's .ssh into the container so that if you are using SSH to clone git you have credentials; in your case it also mounts the configuration, hence the login failure.
I will make sure we add it to the configuration file, so it is more visible
AFAIK that's the only way right now (see my comment here - https://clearml.slack.com/archives/CTK20V944/p1657720159903739?thread_ts=1657699287.630779&cid=CTK20V944 )
Or then if you have the ClearML paid service, I believe there is a "vaults" service, right AgitatedDove14 ?
Yep UnevenDolphin73 :)
Hmm... That's what happens, with the exception of None/'' if the type is str... There is no way to differentiate between them in the UI.
This is why we opted for type=str: it will "cast" everything to str so you always get a str, while not specifying a type will leave the variable as is... If you have an idea on how to support both, feel free to suggest 🙂
Hmm UnevenDolphin73 I just checked with v1.1.6, the first time the configuration file is loaded is when calling Task.init (if not running with an agent, which is your case).
But the main point I just realized I missed 🤯
"http://"${CLEARML_ENDPOINT}":8080"
The code does not try to resolve OS environments there!
Which, well, is a nice feature to add
https://github.com/allegroai/clearml/blob/d3e986393ac8d1a1ea48302224962570ab8e6f9e/clearml/backend_api/session/session.py#L576
should p...
Is it not possible to say just look at my requirements.txt file and the imports in the script?
I think there is a GitHub Issue for this feature
(basically the issue is that requirements.txt files are very often not updated and have no real version lock, so replicating the actual working env is always safer)
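That said, if you do want to force your requirements.txt, I believe calling something like this before Task.init should do it (the path here is illustrative):
```python
from clearml import Task

# assuming Task.add_requirements accepts a path to a requirements file
Task.add_requirements("/path/to/requirements.txt")
task = Task.init(project_name="examples", task_name="pinned requirements")
```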
I think you are correct, the env variable is not resolved in "time". It might be that it's resolved at import, not at Task.init
Hi @<1523704157695905792:profile|VivaciousBadger56>
No, these are 3 different ways of building pipelines.
Creating from decorators is recommended when each component can be easily packaged into a single function (every function can have an accompanying repository).
Here the idea is that it is very easy to write complex execution logic; basically the automagic does serialization/deserialization, so you can write pipelines just like you would write Python code.
Creating from Tasks is a good match if you need to ...
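A rough sketch of the decorator flavor (names and values here are made up):
```python
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["doubled"], cache=True)
def double(values):
    # each component runs as its own Task when launched by the pipeline
    return [v * 2 for v in values]

@PipelineDecorator.pipeline(name="toy pipeline", project="examples", version="0.0.1")
def logic():
    # plain python control flow; serialization between steps is handled for you
    print(double([1, 2, 3]))

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # debug the whole pipeline on this machine
    logic()
```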
Correct 🙂
So inside the pipeline logic you can do Task.current_task().id
Or inside a component Task.current_task().parent
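i.e. something like:
```python
from clearml import Task

# inside the pipeline logic: the controller Task's own id
pipeline_id = Task.current_task().id

# inside a component: parent points back to the pipeline controller
controller_id = Task.current_task().parent
```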
Yes that makes total sense to me. How about a GitHub issue on the clearml-docs ?
okay this points to an issue with the k8s glue, I think it somehow failed to launch the pod. Can you send me the log of the clearml-k8s-glue ?
You can make reports on experiments with interactive graphs
Yes, I can totally see how this is a selling point. The closest is the Project Overview (full markdown capabilities, with the ability to embed links to specific experiments). You can also add a "leader metric", so you can track the project performance/progress over time.
I have to admit that creating a better reporting tool is always pushed down in priority as I think this is a good selling point to management but the actual ...
Hi SmallDeer34
Did you call Task.init ?
So the only difference is how I log in to the machine to start clearml
the only difference that I can think of is the OS environment variables in the two login types:
can you run export in the two cases and check the diff between them?
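e.g. something along these lines (file names are arbitrary):
```
export > env_a.txt      # from the first login type (e.g. SSH session)
export > env_b.txt      # from the second login type
diff env_a.txt env_b.txt
```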
I've seen that the file location of a task is saved
What do you mean by that? is it the execution section "entry point" ?
Nothing, except that Draft makes sense: it feels like the task is being prepped, while Aborted feels like something went wrong
Yes, I guess that if we call execute remotely without a queue, it makes sense for you to edit it...
Is that the case TrickySheep9 ?
If it is, I think we should change it to Draft when it is not queued. Sounds good to you guys?
I think this issue was fixed in clearml-server 1.3.0 (released after the weekend),
Let me check
EnviousPanda91 the agent checks if you have a .ssh folder on the host machine; if you do, it will copy+mount it into the container, then delete the copy when the container is down.
Specifically /tmp/clearml_agent.ssh.rbw8o0t7
is the copy of the .ssh that the agent created, and now it is mounting it into the container
Hi SkinnyPanda43
Let's say that I install the shared libs with pip in editable mode on my development environment, how will the clearml-agent handle those libraries if I submit a job
So installing packages from local folders with "-e" is in general ill-advised.
But using a full git path should work out of the box. For example, if you pip install
https://github.com/user/repo/repo.git then the agent will be able to install it on the remote machine. The main challenge...
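For example, a requirements line using pip's VCS syntax (the repo URL is just a placeholder):
```
# requirements.txt - the agent can install this on the remote machine
git+https://github.com/user/repo.git@main
```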
Thanks RobustRat47 !
Should we put somewhere this requirement ? (i.e. nvidia drivers) ?
Is this really a must ?
Yes VexedCat68, consider the txt file the Dataset "content"; this will enable you to safely get the list of files, and then you can use the StorageManager to download them. We could extend this concept and have it built into the Dataset itself, i.e. allow you to add files as links and make sure it will just download them. The caveat here is that the Dataset, at the end, returns a folder with the files; when you specify links, you have to also specify the target location locally (at the end you want a fol...
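e.g. (the link is just a placeholder):
```python
from clearml import StorageManager

# download (and cache) a single link taken from the txt "content" file
local_path = StorageManager.get_local_copy(remote_url="s3://my-bucket/data/file_001.npy")
print(local_path)
```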
VivaciousWalrus99 any chance the original Task was executed with python2 ?
what do you have for:
ls -la /cs/usr/gal.hyams/.trains/venvs-builds/3.7/bin/
Hi DisgustedDove53
Is redis used as permanent data storage or just cache?
Mostly cache (I think)
Would there be any problems if it is restarted and comes up clean?
Pretty sure it should be fine, why do you ask ?
ok, but this happens in my local machine, not in the agent
resource monitoring is always running in the background, even on local machines. (of course you can turn it off)
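If I remember correctly you can disable it at Task.init, something like:
```python
from clearml import Task

# assuming the auto_resource_monitoring flag - turns off the background
# CPU/GPU/memory monitoring for this task
task = Task.init(
    project_name="examples",
    task_name="no resource monitor",
    auto_resource_monitoring=False,
)
```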
WackyRabbit7
regular trains-agent modus operandi is one job at a time (i.e. until the Task is done, no other Tasks will be pulled from the queue).
When adding --services-mode, it is not 1-1 but 1-N, meaning a single trains-agent will launch as many Tasks as it can.
The trains-agent pulls a job from the queue and spins up a docker container (only docker is supported for the time being) and lets the job run in the background (the job itself will be registered as another "worker" in the system). Then the...
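e.g. a typical services-mode launch looks roughly like this (assuming the standard flags, queue name is up to you):
```
trains-agent daemon --queue services --services-mode --docker
```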