AstonishingWorm64 can you share the full log (In the UI under Results/Console there is a download button)?
How about this one:
None
What's the trains version / trains-server version ?
Hi @<1534706830800850944:profile|ZealousCoyote89>
We'd like to have pipeline A trigger pipeline B
Basically a Pipeline is a Task (of a specific Type), so you can have pipeline A function clone/enqueue the pipelineB Task, and wait until it is done. wdyt?
what format should I specify it
requirements.txt format e.g. ["package >= 1.2.3"]
Would this enforce that package on various components
This is a per component control, so you can have different packages / containers based on the componnent
Would it then no longer capture import statements?
This is replacing the auto detected packages, but obviously this fails to detect your internal repo package, which is the main issue here.
How is "internal package" installed, in o...
I want to be able to delete only the logs since they are taking a lot of space in my case.
I see... I do not think this is possible 😞
You can disable the auto logging though ... pass auto_connect_streams=False
to Task.init
but realized calling that from the extension would be hard, so we opted to have the TypeScript code make calls to the ClearML API server directly, e.g.
POST /tasks.get_all_ex
.
did you manage to get that working?
- To get the credentials, we read the
~/clearml.conf
file. I tried hard, but couldn't get a TypeScript library to work to parse the HOCON config file format... so I eventually resorted to using (likely brittle) regex to grab the ClearML endpoint and API ke...
I still can't get it to work... I couldn't figure out how can I change the clearml version in the runtime of the Cleanup Service as I'm not in control of the agent that executes it
Let's take a step back. Let's remove the clearml-services from the docker compose for a second, and run it manually (then you can control everything). Once you have it running manually, let's try to replicate the setup back to the docker compose, make sense ?
PipelineController works with default image, but it incurs overhead 4-5 min
You can try to spin the "services" queue without docker support, if there is no need for containers it will accelerate the process.
Repository cloning failed: Command '['git', 'fetch', '--all', '--recurse-submodules']' returned non-zero exit status 1.
This error is about failing to clone the pipeline code repo, how is that connected to changing the container ?!
Can you provide the full log?
Hi @<1523701079223570432:profile|ReassuredOwl55>
I want to kick off the pipeline and then check completion
outside
of the pipeline task. (edited)
Basically the pipeline is a Task (of a certain type).
You do the "standard" thing, you clone the pipeline Task, you enqueue it, and you wait for it's status
task = Task.clone(source_task="<pipeline ID here>")
Task.enqueue(task, queue_name=services)
task.wait_for_status(...)
wdyt?
Hi GreasyPenguin14
It looks like you are trying to delete a Task that does not exist
Any chance the cleanup service is misconfigured (i.e. accessing the incorrect server) ?
Hi TightElk12
would like to understand the limitations of
Task.current_task()
Basically this will always get you an instance of the current Task. This will work from sub-processes as well as the main process. Is there a specific scenario you have in mind, or a challenge with the use case ?
5 seconds will be a sleep between two consecutive pulls where there are no jobs to process, why would you increase it to a higher pull freq ?
Hi SteadyFox10
Short answer no 😞
Long answer, full permissions are available in the paid tier, along side a few more advanced features.
Fortunately in this specific use case, the community service allows you to share a single (or multiple) experiments with a read-only link. Would that work ?
Where do you store those ?
from clearml import TaskTypes
That will only work if you are using the latest from the GitHub, I guess the example code was modified before a stable release ...
So I'm gusseting the cli will be in the folder of python:import sys from pathlib2 import Path (Path(sys.executable).parent / 'cli-util-here').as_posix()
now it stopped working locally as well
At least this is consistent 🙂
How so ? Is the "main" Task still running ?
I think this is great! That said, it only applies when you are spining agents (the default helm is for the server). So maybe we need another one? or an option?
CooperativeFox72 I would think the easiest would be to configure it globally in the clearml.conf (rather than add more arguments to the already packed Task.init) 🙂
I'm with on 60 messages being way too much..
Could you open a Github Issue on it, so we do not forget ?
I mean just add the toy tqdm loop somewhere just before starting the lightning train function. I just want to verify that it works, or maybe there is something in the specific setup happening in real-time that changes it
Sounds good to me 🙂
Also, I just wanted to say thanks for the tool! I'm managing a small data science practice and it's going to be really nice to have a view of all of the experiments we've got and know our GPU utilization, all without having to give every data scientist access to each box where the workflows are run. Incredibly stoked.
♥ ❤ ♥
The thing I don't understand is how come this DOES work on our linux setups
I do not think it actually works... I could not have find a code that will convert the ENV in the config string ...
I'll be happy to test it out if there's any commit available?
Please do, and feel free to PR it 😍
https://github.com/allegroai/clearml/blob/d3e986393ac8d1a1ea48302224962570ab8e6f9e/clearml/backend_api/session/session.py#L576
https://github.com/allegroai/clearml/blob/d3e98639...
Can you please tell me if you know whether it is necessary to rewrite the Docker compose file?
not by default, it should basically work out of the nox as long as you create the same data folders on the host machine (e.g. /opt/clearml)