DS, this way they only need to remember (and me only need to teach them where to find) one id.
Yes that's the point, this ID is the Model UID (as opposed to the Task ID), the reason I kind if "insist" on it is that the Model ID is built into the system meaning, this is how you register it, as opposed to the Task ID that somehow needs to be hacked/passed externally
TBH the main reason I went with our API is that because of the custom model loading, we need to use the "custom" framew...
Okay yes, that's exactly the reason!! Cross origin blocks the file link
SmarmySeaurchin8
When running in "dev" mode (i.e. writing the code) only packages imported directly are registered under "installed packages" , then when the agent is executing the experiment, it will update back the entire environment (including derivative packages etc.)
That said you can set detect_with_pip_freeze
to true (in trains.conf) and it will basically store the entire pip freeze.
https://github.com/allegroai/trains/blob/f8ba0495fb3af1f99732fdffbbccd2fa992934a4/docs/trains.c...
About .get_local_copy... would that then work in the agent though?
Yes it would work both locally (i.e. without agent) and remotely
Because I understand that there might not be a local copy in the Agent?
If the file does not exist locally it will be downloaded and cached for you
ConvolutedSealion94 try scikit
not scikitlearn
I think we should add a warning if a key is there and is being ignored... let me make sure of that
I just disabled all of them with
auto_connect_frameworks=False
Yep that also works
Btw it seems the docker runs in
network=host
Yes, this is so if you have multiple agents running on the same machine they can find a new open port π
I can telnet the port from my mac:
Okay this seems like it is working
CleanPigeon16 , just making sure, docker is installed and configured on the host machine (i.e. Azure machine)?
oh...so is this a bug?
It was always a bug, only an elusive one π
Anyhow, I'll make sure we push a fix to GitHub, an RC is planned for later this week, it will contain it
EnviousStarfish54 generally speaking the hyper parameters are flat key/value pairs. you can have as many sections as you like, but inside each section, key/value pairs. If you pass a nested dict, it will be stored as path/to/key:value (as you witnessed).
If you need to store a more complicated configuration dict (nesting, lists etc), use the connect_configuration, it will convert your dict to text (in HOCON format) and store that.
In both cases you can edit the configuration and then when ru...
BoredGoat1 where exactly do you think that happens ?
https://github.com/allegroai/trains/blob/master/trains/utilities/gpu/gpustat.py#L316
?
https://github.com/allegroai/trains/blob/master/trains/utilities/gpu/gpustat.py#L202
yeah. I am getting logs, but they are extremely puzzling to me. I would appreciate to actually have access to whole package structure..
Actual packages are updated back to "Installed Packages" section (under the execution tab).
indeed. can you maybe point where the docker command is composed.
https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/clearml_agent/commands/worker.py#L3694
π
BTW: you can run/build the entire thing on your machin...
i'm Jax, not Manoj! lol.
I know π I just mentioned that this issue is being actively discussed
Okay, I was able to reproduce it (this is odd) let me check ...
JitteryCoyote63 see if upgrading the packages as they suggest somehow fixes it.
I have the feeling this is the same problem (the first error might be trains masking the original error)
Hi JitteryCoyote63
I think this is the default python str() casting.
But you can specify the preview test when you call upload_artifact:
https://clear.ml/docs/latest/docs/references/sdk/task#upload_artifact
see preview
argument
Hi PanickyFish98
It verifies it has access to it when actually creating the Task, maybe it should be a warning?!
fyi: you can also change the value from the UI (under Execution output) or have a default one set in the clearml.conf
used by the agent
A more detailed instructions:
https://github.com/allegroai/trains-agent#installing-the-trains-agent
Notice the configuration parameters:
https://github.com/allegroai/clearml/blob/34c41cfc8c3419e06cd4ac954e4b23034667c4d9/examples/services/monitoring/slack_alerts.py#L160
https://github.com/allegroai/clearml/blob/34c41cfc8c3419e06cd4ac954e4b23034667c4d9/examples/services/monitoring/slack_alerts.py#L162
https://github.com/allegroai/clearml/blob/34c41cfc8c3419e06cd4ac954e4b23034667c4d9/examples/services/monitoring/slack_alerts.py#L156
Hi DefeatedCrab47
You mean by trains-agent, or accumulated over all experiences ?
Hmm maybe we should add a test once the download is done, comparing the expected file size and the actual file size, and if they are different we should redownload ?
This is what I just used:
` import os
from argparse import ArgumentParser
from tensorflow.keras import utils as np_utils
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Activation, Dense, Softmax
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint
from clearml import Task
parser = ArgumentParser()
parser.add_argument('--output-uri', type=str, required=False)
args =...
if the file is untracked by git, it is not saved by clearml
Yep π
Does clearml-agent install the repo withΒ
pip install -e .
It is supported, but the path to the repo cannot be absolute (as it will probably be something else in the agent env)
You can add "git+ https://github.com ...." to the "installed packages" The root path of your repository is always added to the PYTHONPATH when the agents executes it, so in theory there is no need to install it wi...
Hmm that is odd, let me see if I can reproduce it.
What's the clearml version you are using ?
Also there was a truck that worked in the previous big, could you zoom out in the browser, and see if you suddenly get the plot?
I called task.wait_for_status() to make sure the task is done
This is the issue, I will make sure wait_for_status() calls reload at the ends, so when the function returns you have the updated object