If I have an alternative location for the vscode, where should I indicate it in the configuration?
We might need to add support for that, but it should not be a problem to override (e.g. downloadable link like http/s3/ etc.)
Is this something that is doable?
Wait, so the pipeline step only runs if the pre-execute callback returns True? It'll stop if it doesn't run?
Only if you pass a callback function and that callback returns False will the step be skipped (otherwise it will be processed)
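For reference, a minimal sketch of such a callback (assuming the `pre_execute_callback` argument of `add_step`, which receives the pipeline, the node, and the resolved parameters; the `General/run_training` parameter here is just a placeholder):

```python
from clearml import PipelineController

def maybe_skip_step(pipeline, node, parameters):
    # Returning False tells the controller to skip this step;
    # anything truthy lets it run as usual.
    return parameters.get('General/run_training', True)

pipe = PipelineController(name='example pipeline', project='examples', version='1.0.0')
pipe.add_step(
    name='stage_train',
    base_task_project='examples',
    base_task_name='pipeline step 3 train model',
    pre_execute_callback=maybe_skip_step,
)
```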
Another question: in the parents sequence in pipe.add_step, we have to pass in the name of the step, right?
Correct, the step name is a unique identifier within the pipeline
how would I access the artifact of a previous step within the pre ...
Is this a common case? Maybe we should change the run_pipeline_steps_locally argument to False?
(The idea of run_pipeline_steps_locally=True is that it makes it easier to debug the entire pipeline on the same machine.)
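A quick sketch of how that flag is used (assuming the `PipelineController.start_locally` API):

```python
# Debug mode: run the controller and every step as local subprocesses
pipe.start_locally(run_pipeline_steps_locally=True)

# Or: run only the controller locally and enqueue the steps for remote agents
# pipe.start_locally(run_pipeline_steps_locally=False)
```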
Thanks ReassuredTiger98
From the log this is what conda is installing, it should have worked
/tmp/conda_env1991w09m.yml:
```yaml
channels:
- defaults
- conda-forge
- pytorch
dependencies:
- blas~=1.0
- bzip2~=1.0.8
- ca-certificates~=2020.10.14
- certifi~=2020.6.20
- cloudpickle~=1.6.0
- cudatoolkit~=11.1.1
- cycler~=0.10.0
- cytoolz~=0.11.0
- dask-core~=2021.2.0
- decorator~=4.4.2
- ffmpeg~=4.3
- freetype~=2.10.4
- gmp~=6.2.1
- gnutls~=3.6.13
- imageio~=2.9.0
- ...
```
have a CI/CD (e.g. GitHub Actions) that updates my "production" pipeline on the ClearML UI,
I think this is the easiest way: basically the CI/CD launches a pipeline (which under the hood is another type of Task) by querying the latest "Published" pipeline that is also not archived, then cloning it and pushing it to the execution queue.
In the UI, when you want to "upgrade" the production pipeline, you just right click "Publish" on the pipeline you want to launch. Another way is to do the same with Tags...
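A rough sketch of what that CI/CD job could look like (the project path and queue name are placeholders; this assumes `Task.get_tasks` filtering plus the `Task.clone` / `Task.enqueue` class methods):

```python
from clearml import Task

# Query the latest "Published", non-archived pipeline task
candidates = Task.get_tasks(
    project_name='my_project/.pipelines/my_pipeline',  # hypothetical pipeline project
    task_filter={
        'status': ['published'],
        'system_tags': ['-archived'],   # exclude archived tasks
        'order_by': ['-last_update'],   # newest first
    },
)
latest_pipeline = candidates[0]

# Clone it and push the clone to an execution queue
cloned = Task.clone(source_task=latest_pipeline, name='production pipeline run')
Task.enqueue(cloned, queue_name='services')
```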
Hi TrickyRaccoon92
TKinter is suddenly used as backend, and instead of writes to the dashboard I get popups per figure.
Are you running with an agent or manually executing the code?
Yes, look for the clearml serving session ID in the web UI (just go to the home screen and put the UID in the search 🙂)
if it ain't broke, don't fix it 🙂
Up to you, just a few features & nicer UI.
BTW: everything is backwards compatible; there is no need to change anything, all the previous trains/trains-agent packages will work without changing anything 🙂
(This even includes the configuration file, so you can keep the current ~/trains.conf and work with whatever combination you like of trains/clearml on the same machine)
Yes, actually the first step would be a toggle button for regexp in the search, the second will be even more advanced search.
May I suggest you post it on the UI suggestion issue https://github.com/allegroai/trains/issues/81 ?
Oh if this is the case you can probably do
```python
import os
import subprocess
from time import sleep  # needed for the polling loop below

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()
queue_ids = client.queues.get_all(name="queue_name_here")

while True:
    result = client.queues.get_next_task(queue=queue_ids[0].id)
    if not result or not result.entry:
        sleep(5)
        continue
    task_id = result.entry.task
    # mark the task as started, then prepare the environment for a subprocess
    client.tasks.started(task=task_id)
    env = dict(**os.environ)
    env['CLEARML_TASK_ID'] = task_id
    ...
```
Maybe the configuration file changed?
The logic is: if the name and project are the same, there are no artifacts/models, and the Task was created less than 72 hours ago, reuse the Task
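If that reuse behavior gets in the way, it can be controlled explicitly; a minimal sketch, assuming the `reuse_last_task_id` argument of `Task.init`:

```python
from clearml import Task

# Force a brand-new Task instead of reusing the previous one
task = Task.init(
    project_name='examples',
    task_name='my experiment',
    reuse_last_task_id=False,
)
```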
for a GPU with more than 16GB GRAM and less than 40GB, so sometimes we need to provision an A100 to get the training speed we want, but we don't use all the GRAM
Oh that makes sense...
Just saw this one, this might help?
https://www.globenewswire.com/news-release/2022/10/24/2539924/0/en/ClearML-and-Genesis-Cloud-Announce-New-MLOps-Partnership-Delivering-100-Green-Energy-Compute-Solution-for-Machine-Learning.html
Hi UnevenDolphin73
How can I ensure tasks in a pipeline have the same environment as the pipeline itself?
...
but the tasks (executed remotely) do not use that same environment?
Just verifying, we are talking about pipeline decorators?
We also wanted this; we preferred to create a docker image with all we need, and let the pipeline steps use that docker image
You can specify the docker on the decorator itself:
https://github.com/allegroai...
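Something along these lines (a sketch; the image name is just an example):

```python
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(
    return_values=['model_path'],
    docker='nvcr.io/nvidia/pytorch:23.03-py3',  # example image, use your own
)
def train(dataset_id):
    # when executed remotely, this step runs inside the specified docker image
    ...
```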
just got the pipeline to run
Nice!
using the default queue okay?
Using the default queue is fine. The different queue is the "services" queue; by default the trains-server runs an agent that will pull jobs from there.
In "services" mode, an agent pulls jobs one right after the other (not waiting for the previous job to finish), as opposed to a regular queue (any other), where the trains-agent will pull a job only after the previous one has completed.
I'm not sure I follow the example... Are you sure this experiment continued a previous run?
What was the last iteration on the previous run ?
ClearML seems to store stuff that's relevant to script execution outside of clearml.Task
Outside of the clearml.Task?
I guess I need to do something like the following after the task was created:
...
Yes!
Why use the "post" callback and not the "pre" callback?
The post gets back the Model object. The pre allows you to decide if you actually want to log it in the first place (come to think of it, maybe you want that as well 🙂)
So could you re-explain, assuming my pipeline object is created by `pipeline = PipelineController(...)`?
```python
pipe.add_step(
    name='stage_train',
    parents=['stage_process'],
    monitor_artifact=['my_created_artifact'],
    base_task_project='examples',
    base_task_name='pipeline step 3 train model',
    parameter_override={'General/dataset_task_id': '${stage_process.id}'},
)
```
This will put the artifact named "my_created_artifact" from the step Tas...
ReassuredTiger98
Can you explain what you meant by "entropy point file"?
There is no need to specify an entry point file.
It is automatically detected when you run the code manually on your machine.
My assumption was that the file "src/run_task.py" (based on your log) is just a test file, and hence was not added to the repository. So the agent failed to actually restore it from git (files that are not added are not considered part of the git diff; this is usually git behavio...
Then it initiates a run on AWS, which I want to use the same task-id.
BoredPigeon26 Clone the Task; it basically creates a new copy (of the setup/configuration etc.).
Then you can launch it on an aws instance (I'm assuming with clearml-agent)
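i.e. something like this sketch (assuming a clearml-agent on the AWS instance is listening on a queue named "aws", which is just a placeholder):

```python
from clearml import Task

original = Task.get_task(task_id='<original-task-id>')
new_task = Task.clone(source_task=original)  # fresh copy of the setup/configuration
Task.enqueue(new_task, queue_name='aws')     # picked up by the agent on AWS
```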
wdyt?
But it writes over the execution tab in the GUI
It does, you are correct. It will however not overwrite the reports (logged scalars etc.)
Hi DefiantSpider5
So there are two answers here, I'll start with the open-source version of both
Is there a way in clear ml to interactively view subsets of images based on a lasso of embedding plots
The ClearML Datasets have no "query" capabilities of the data inside the dataset. That means you can see preview images, statistics and download the datasets, but no query capabilities. On the other hand, there is no limitation on the type and format of me...
This might work (I have to admit I haven't had the time to test it; please let me know if it works, so we could push it as a cool new feature 🙂)
```python
class LocalClearmlJob(ClearmlJob):
    def __init__(self, *args, **kwargs):
        super(LocalClearmlJob, self).__init__(*args, **kwargs)

    def launch(self, queue_name=None):
        # type: (str) -> bool
        if self._is_cached_task:
            return False
        # create the subprocess
        cmd = self.task.data.execution.script.ent...
```
oh, then this is user/pass (pass is the same as app key / secret)
JitteryCoyote63
I am setting up a new machine with two RTX 3070 GPUs
Nice! You are one of the lucky few who managed to buy them 🙂
Which makes me think that the wrong torch package is installed
I think that torch 1.3.1 does not support CUDA 11
Hi SlipperySheep79
Is there a way to specify the working dir from the decorator
not directly, but why would that change anything? I mean the component code will be created in the git root, and you can still access files inside the subfolders
from .subfolder import something
what am I missing?
ReassuredTiger98 maybe we should add an option to send a text message along with the abort?
(Actually it is just a matter of passing the argument)
wdyt?
Could it be you have an old OS environment variable overriding the configuration file?
Can you change the IP of the server in the conf file, and make sure it has an effect (i.e. the error changed)?
Hi RoughTiger69
but still get the semantics of knowing when an (external) file changed?
How would you know it changed?
This implies you have a way to verify the hash, which means you download the data, no?
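To make that concrete, a minimal sketch of what "verify the hash" entails (plain hashlib; the digest can only be computed by streaming every byte of the file):

```python
import hashlib

def file_sha256(path, chunk_size=1 << 20):
    # Hashing requires reading the entire file through the digest,
    # i.e. you effectively have to download all the data first.
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```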