I don't think agents are aware of each other, which means that you can have as many agents as you want and, depending on your task usage, they will be fighting for CPU and GPU usage ...
Found the issue: my bad practice for import 😛
You need to import clearml before setting up the argument parser. Bad way:
import argparse

def handleArgs():
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--config-file', type=str, default='train_config.yaml',
                        help='train config file')
    parser.add_argument('--device', type=int, default=0,
                        help='cuda device index to run the training')
    args = parser....
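For contrast, a minimal sketch of the ordering described in this post: clearml is imported at the top, before argparse is touched. The try/except fallback, the `argv` parameter, and the project/task names in `main()` are assumptions added so the sketch can run standalone; they are not from the original post.

```python
# Import clearml first, before any argument parsing is set up.
try:
    from clearml import Task
except ImportError:
    # Assumption: fallback so this sketch still runs where clearml is not installed.
    Task = None

import argparse


def handle_args(argv=None):
    """Same arguments as the snippet above; argv=None means 'read sys.argv'."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--config-file', type=str, default='train_config.yaml',
                        help='train config file')
    parser.add_argument('--device', type=int, default=0,
                        help='cuda device index to run the training')
    return parser.parse_args(argv)


def main(argv=None):
    if Task is not None:
        # Hypothetical project and task names, for illustration only.
        Task.init(project_name='examples', task_name='argparse ordering')
    return handle_args(argv)
```

Calling `main()` from the training script then gives both the parsed arguments and, when clearml is available, a task that was created after the import.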
An artifact can be anything that you upload to storage with the clearml SDK. Which storage is used is defined by your clearml.conf (with its credentials). The ClearML web and API servers do not store those files.
A model is a special kind of artifact.
For example, you have the lineage feature: if you train model B using model A as a starting point (aka pre-trained), and model C from model B, ... The lineage will track that model C was built on...
Are you running within a zero-trust environment like Zscaler?
Feels like your issue is not ClearML itself, but an issue with HTTPS/SSL and the certificate from your zero-trust system.
So what was the solution/hack then?
You are using CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL the wrong way, while the other may need to be 1 instead of true
@ShakyKangaroo32 If you just want something to run at a regular interval, have you considered TaskScheduler?
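A minimal sketch of what that could look like with `clearml.automation.TaskScheduler`. The helper name `schedule_daily`, the queue name, and the time of day are placeholders, and the task ID has to belong to an existing task on your server; treat this as a starting point, not a tested setup.

```python
def schedule_daily(task_id, queue='default', hour=2, minute=0):
    """Register an existing task to be re-launched every day at hour:minute."""
    # Imported lazily so this module can be loaded without clearml installed.
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    scheduler.add_task(
        schedule_task_id=task_id,  # ID of the task to clone and enqueue
        queue=queue,               # queue an agent is listening on (assumed name)
        hour=hour,
        minute=minute,
        day=1,                     # every 1 day
        recurring=True,
    )
    return scheduler
```

The returned scheduler does nothing until it is started: `scheduler.start()` runs the loop in the current process, while `scheduler.start_remotely(queue='services')` hands it over to an agent.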
Without clearml-session, how could one set this up? I cannot find any documentation/guide on how to do this ... The official doc seems to say: you start a code server that then connects to vscode.dev. Then from your laptop, you go to vscode.dev in order to access your code server. Is there any way to do this without going to vscode.dev?
Once you manually install your package inside the docker container, check that your file module_b/templates/my_template.yml is where it should be.
For example, for dataset_dir I would expect a single path, not an array of 2 duplicated paths.
You should be able to test your credentials first using something like rclone or azure-cli.
Wow, I did not know that vscode has an HTTP "interface"!!! It kind of makes sense, as vscode is just Chrome rendering a webpage behind the scenes?
What is the difference between vscode via clearml-session and vscode via the Remote SSH extension?
We use task.export_task() and a hacked version to get the console log:
def save_console_log(task: clearml.Task, fs, remote_path, number_of_reports=10000):
    from clearml.backend_api.services import events
    from clearml.backend_api import Session
    # Stolen from Task.get_reported_console_output()
    if Session.check_min_api_version('2.9'):
        request = events.GetTaskLogRequest(
            task=task.id,
            order='asc',
            navigate_earlier=True,
            ...
Not sure how to do it for debug samples and scalars ...
But theoretically, with the above, one should be able to fully reproduce a run.
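A rough sketch of the same idea using only the public `task.export_task()` and `task.get_reported_console_output()` calls, writing to a local folder instead of the `fs`/`remote_path` pair used in the hacked version above. The function name `save_task_snapshot` and the file names are made up for illustration, and this does not cover debug samples or scalars.

```python
import json
from pathlib import Path


def save_task_snapshot(task, out_dir, number_of_reports=10000):
    """Dump the task definition and its console output into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # export_task() returns a plain dict describing the task.
    task_data = task.export_task()
    (out / 'task.json').write_text(json.dumps(task_data, indent=2, default=str))

    # get_reported_console_output() returns a list of strings, one per report.
    reports = task.get_reported_console_output(number_of_reports=number_of_reports)
    (out / 'console.log').write_text(''.join(reports))
    return out
```

The function only relies on those two methods, so it works with any `clearml.Task` fetched via `Task.get_task(task_id=...)`, and the saved `task.json` is the kind of dict that `Task.import_task()` expects.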
So the issue is that, for some reason, the pip install by the agent doesn't behave the same way as your local pip install?
Have you tried to manually install your module_b with pip install on the machine that is running clearml-agent? Seeing your example, it looks like you are even running inside docker?
You should know where your latest model is located, then just call task.upload_artifact on that file?
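Something along these lines, as a sketch: pick the most recently modified file in the checkpoint folder and pass it to `task.upload_artifact`. The helper name, the `*.pt` pattern, and the artifact name `latest_model` are assumptions for illustration; adjust them to however your training loop saves models.

```python
from pathlib import Path


def upload_latest_model(task, model_dir, pattern='*.pt', name='latest_model'):
    """Upload the newest file matching pattern in model_dir as a task artifact."""
    candidates = [p for p in Path(model_dir).glob(pattern) if p.is_file()]
    if not candidates:
        raise FileNotFoundError(f'no files matching {pattern!r} in {model_dir}')

    # "Latest" here means the most recent modification time.
    latest = max(candidates, key=lambda p: p.stat().st_mtime)
    task.upload_artifact(name=name, artifact_object=str(latest))
    return latest
```

Inside a running experiment, `task` would typically be `Task.current_task()`.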
If you are on github.com, you can use a fine-grained PAT (personal access token) to limit access to the minimum. Although the token will be tied to an account, it's quite easy to change to another one from another account.
Normally, you should have an agent running behind a "services" queue, as part of your docker-compose. You just need to make sure that you populate the appropriate configuration on the server (aka set the right environment variables for the docker services).
That agent will run as long as your self-hosted server is running.
(I never played with the pipeline feature, so I am not really sure that it works as I imagined ...)
Please refer to the doc.
The doc needs to be a bit clearer: one requires a path and not just true/false.
1.12.2 because of some bug that makes fastai lag 2x
1.8.1rc2 because it fixes an annoying git clone bug
Just saw that repo: who are coder? That's not the vscode developer team, is it?