Do you mean having the ClearML FileServer store on Azure Blob instead of on the local drive?
Yes, that is what I wanted.
If so, that's not possible. You can, however, point the fileserver data folder to a mounted folder - if you have something that can mount Azure Blob as a local filesystem folder, it will work (the file server will always treat it as a local file system)
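If it helps, a rough sketch of that mount approach - tool choice (blobfuse2) and paths are assumptions, not something ClearML itself provides; the volume mapping follows the default ClearML docker-compose layout:

```shell
# Mount an Azure Blob container onto the folder the fileserver uses as its
# data directory (blobfuse2.yaml holds your storage account credentials).
blobfuse2 mount /opt/clearml/data/fileserver --config-file=./blobfuse2.yaml

# docker-compose.yml (fileserver service) then maps that folder as usual:
#   volumes:
#     - /opt/clearml/data/fileserver:/mnt/fileserver
```

The fileserver keeps treating the folder as a local disk; blobfuse2 transparently backs it with blob storage.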
Thanks for confirming that it's the only solution. 👍
You can either set your user permissions to allow group write by default?
Or maybe create a dedicated user with group-write permission and run the agent as that user?
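A sketch of the second option - user name, group name, and queue name are all examples, not prescribed values:

```shell
# Create a dedicated user and add it to the group that owns the shared folder
sudo useradd -m clearml-agent
sudo usermod -aG shared-group clearml-agent

# Run the agent as that user with a umask that grants group write (002)
sudo -u clearml-agent bash -c 'umask 002 && clearml-agent daemon --queue default'
```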
How did you provide credentials to ClearML and git?
I saw that page ... but nothing about the number of workers for a queue ... or did I miss it?
@<1523701087100473344:profile|SuccessfulKoala55> I can confirm that v1.8.1rc2 fixed the issue in our case. I managed to reproduce it:
- Do a local commit without pushing
- Create a task and queue it
- The queued task fails as expected, since the commit is only local
- Push your local commit
- Requeue the task
- Expect the task to succeed now that the commit is available: but it fails, as the vcs seems to be left in a weird state by the previous failure
- With v1.8.1rc2 the issue is solved
You are using CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL the wrong way
Deployed a fresh instance and ran nginx -T in the container:
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
# configuration file /etc/nginx/nginx.conf:
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
error_log stderr notice;
events {
worker_connections 768;
# multi_accept on;
}
http {
client_max_body_size 100M;
rewrite_l...
If you want plots, you can simply generate them with matplotlib and ClearML can upload them to the Plots or Debug Samples section
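A minimal sketch of that, assuming a task was already created with Task.init (the import is guarded so the snippet also runs where clearml isn't installed; title/series names are examples):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# Build an ordinary matplotlib figure
fig, ax = plt.subplots()
ax.plot(range(10), [x ** 2 for x in range(10)], label="y = x^2")
ax.legend()

try:  # guard so the sketch runs even without clearml installed
    from clearml import Task
    task = Task.current_task()  # None if no task was initialized
except ImportError:
    task = None

if task is not None:
    # Explicit reporting into the Plots section; plt.show() calls are
    # also auto-captured by ClearML when a task is active.
    task.get_logger().report_matplotlib_figure(
        title="my plot", series="squares", figure=fig, iteration=0
    )
```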
Please refer to here None
The doc needs to be a bit clearer: it requires a path, not just true/false
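A sketch of what the message above describes - assuming, as it states, that the variable expects a path (e.g. to an existing Python interpreter) rather than a boolean; the path itself is only an example:

```shell
# Not "true"/"1" - point it at an actual path
export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=/usr/bin/python3.10
```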
Sure:
def main():
    repo = "redacted"
    commit = "redacted"
    bands = ["redacted"]
    test_size = 0.2
    batch_size = 64
    num_workers = 12
    img_size = (128, 128)
    random_seed = 42
    epoch = 20
    learning_rate = 0.1
    livbatch_list = get_livbatch_list(repo, commit)
    lbs = download_batches(repo, commit, livbatch_list)
    df, label_map = get_annotation_df(lbs, bands)
    df_train, df_val = deterministic_train_val(df, test_size=test_siz...
with
import pandas as pd
import clearml

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])
task = clearml.Task.current_task()
task.get_logger().report_table(title='table example', series='pandas DataFrame', iteration=0, table_plot=df)
# logger.report_table(title='table example',series='pandas DataFrame',iteration=0,tabl...
I know that the git clone and pip verification of all installed packages are normal. But for some reason, in Michael's screenshot I don't see those steps ...
Had you made sure that the agent inside the GCP VM has access to your repository? Can you ssh into that VM and try to do a git clone?
@<1523701087100473344:profile|SuccessfulKoala55> I managed to make this work by:
concatenating the existing OS CA bundle and the Zscaler certificate, and setting REQUESTS_CA_BUNDLE to that bundle file
Python libraries don't always use OS certificates ... typically, we have to set REQUESTS_CA_BUNDLE=/path/to/custom_ca_bundle_crt because requests ignores OS certificates
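A minimal sketch of building that combined bundle, done here in Python - the OS bundle path is the Debian/Ubuntu default and the certificate filenames are examples:

```python
import os

os_bundle = "/etc/ssl/certs/ca-certificates.crt"  # Debian/Ubuntu default location
zscaler_cert = "zscaler_root_ca.pem"              # your exported Zscaler root CA
combined = "custom_ca_bundle.crt"

# Concatenate whichever inputs exist into one PEM bundle
with open(combined, "wb") as out:
    for path in (os_bundle, zscaler_cert):
        if os.path.exists(path):
            with open(path, "rb") as f:
                out.write(f.read())
            out.write(b"\n")

# requests (and libraries built on it) will now trust the combined bundle
os.environ["REQUESTS_CA_BUNDLE"] = os.path.abspath(combined)
```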
Yup, you have the flexibility and options, that's what's so nice about ClearML
The ClearML staff may have a better solution, as I am not familiar with docker mode
Is it because your training code downloads the pretrained model from pytorch or wherever to a local disk in /tmp/xxx, then trains from there? So ClearML will just reference the local path.
I think you need to manually download the pretrained model, then wrap it with a ClearML InputModel (eg here )
And then use that InputModel as the pretrained model?
Maybe the ClearML staff have a better approach? @<152370107039036...
Please share your .service content too, as there are a lot of ways to "spawn" processes in systemd
You can upload the df as an artifact.
Or compute the statistics as a DataFrame and upload that as an artifact?
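A minimal sketch of that, assuming a task was already created with Task.init (the import is guarded so the snippet also runs where clearml isn't installed; the artifact name is an example):

```python
import pandas as pd

df = pd.DataFrame(
    {"num_legs": [2, 4, 8, 0], "num_wings": [2, 0, 0, 0]},
    index=["falcon", "dog", "spider", "fish"],
)
stats = df.describe()  # per-column summary statistics, itself a DataFrame

try:  # guard so the sketch runs even without clearml installed
    from clearml import Task
    task = Task.current_task()  # None if no task was initialized
except ImportError:
    task = None

if task is not None:
    # The DataFrame shows up under the task's Artifacts tab
    task.upload_artifact(name="statistics", artifact_object=stats)
```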
Most of the time, "users" would expect ClearML to handle the caching by itself
is task.add_requirements("requirements.txt") redundant ?
Does ClearML always look for a requirements.txt in the repo root?
What error do you have in the Console log tab, in the Web UI ?
Sounds like your docker image is missing some package. This is unrelated to ClearML.
As for which package is missing, see here
On the same or a different machine!
Wow, I did not know that vscode has an HTTP "interface"!!! Makes some sense, as vscode is basically Chrome rendering a webpage behind the scenes?
please provide the full logs and error message.
By manually updating it, like for any app that is on an offline computer?