Notice that in your example you have
plt.figure()
This actually creates a new, empty matplotlib figure, which is why we first get a blank white image and then the actual plot.
Once I removed it I got a single plot (no need for the manual reporting)
from sklearn.datasets import make_regression

X, y = make_regression(
n_samples=100, # Number of samples
n_features=10, # Number of features
noise=0.1, # Standard deviation of the gaussian noise
random_state=42 # For reproducibility
)
# Convert to DataFrame for better f...
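For reference, a runnable version of that snippet including the DataFrame conversion (the column names are my own illustrative choice, not from the original snippet):

```python
import pandas as pd
from sklearn.datasets import make_regression

X, y = make_regression(
    n_samples=100,    # Number of samples
    n_features=10,    # Number of features
    noise=0.1,        # Standard deviation of the gaussian noise
    random_state=42,  # For reproducibility
)

# Convert to a DataFrame for easier feature inspection
df = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])
df["target"] = y
print(df.shape)  # (100, 11)
```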
Is there a way to use add_external_files with a very large number of urls that are not in the same S3 folder, without running into a usage limit due to the state.json file being updated a lot?
Hi ShortElephant92
what do you mean the state.json is updated a lot?
I think that state.json is updated every time you call add_external_files, but add_external_files can get a folder to scan, which would be more efficient. How are you using it?
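As an illustration of why the folder scan helps: if the URLs span only a handful of S3 "folders", you can group them by prefix first and make one add_external_files call per folder instead of one per file. The helper below is a hypothetical sketch (pure Python, not part of the clearml SDK):

```python
from collections import defaultdict

def group_urls_by_folder(urls):
    """Group URLs by their parent folder, so each folder needs only a
    single add_external_files(source_url=folder) call (hypothetical helper)."""
    groups = defaultdict(list)
    for url in urls:
        folder = url.rsplit("/", 1)[0] + "/"
        groups[folder].append(url)
    return dict(groups)

urls = [
    "s3://bucket/a/file1.csv",
    "s3://bucket/a/file2.csv",
    "s3://bucket/b/file3.csv",
]
grouped = group_urls_by_folder(urls)
print(sorted(grouped))  # ['s3://bucket/a/', 's3://bucket/b/']
```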
Nice! TrickySheep9 any chance you can share them?
Yeah, but I still need to update the links in the clearml server
yes... how many are we talking about here?
You might be able to write a script to override the links ... wdyt?
ClearML does not work easily with Google Drive.
Yes, google drive is not google storage (which ClearML supports 🙂 )
Seems like you solved it?
SmilingFrog76
there is no internal scheduler in Trains
Actually, there is a scheduler built into Trains: the queues (order / priority)
What is missing from it is multi node connection, e.g. I need two agents running the exact same job working together.
(as opposed to, I have two jobs, execute them separately when a resource is available)
Actually my suggestion was to add a SLURM integration, like we did with k8s (I'm not suggesting Kubernetes as a solution for you, the op...
Hmm, if this is the case, you can add some prints in here:
None
the service/action will tell you what you are sending
wdyt?
It should actually work the same, if you find out it fails to properly register let me know (and then I guess a github issue is the next step)
Hi @<1559711593736966144:profile|SoggyCow20>
I would first like to say how amazing clearml is!
Thank you! 🙏
Running in Docker mode (v19.03 and above) - using default docker image: nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04
yes, sdk.agent.default_docker.image = python:3.10.0-alpine
should be agent.default_docker.image = python:3.10.0-alpine
Notice the scope is agent, not sdk
Oh I get it now, can you test: git ls-remote --get-url github
and then: git ls-remote --get-url
AttributeError: 'PosixPath' object has no attribute 'loc'
SarcasticSquirrel56 I'm assuming the artifact is a pandas object and you forgot to either import pandas beforehand or add it as a requirement for the Task 🙂
This is causing the artifact .get()
method to revert to returning the local path to the artifact, instead of actually de-serializing it
(We should print a warning though, I'll make sure we do 🙂 )
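To make the failure mode concrete: when the deserialization dependency is missing, .get() returns a local file path, which you can still load manually once pandas is importable. A sketch with a dummy file standing in for the downloaded artifact (path and format are illustrative):

```python
import os
import tempfile

import pandas as pd

# Stand-in for the local path that artifact.get() falls back to when it
# cannot de-serialize the object (e.g. pandas was never imported)
local_path = os.path.join(tempfile.mkdtemp(), "artifact.csv.gz")
pd.DataFrame({"a": [1, 2, 3]}).to_csv(local_path, index=False)

# Manual de-serialization once pandas is available
df = pd.read_csv(local_path)
print(df["a"].sum())  # 6
```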
EDIT: basically clearml failed to realize you also need pandas because it was never imported ...
Hi RoughTiger69
seems to not take the packages that are in the requirements.txt
The reason for not taking the entire set of installed python packages is that it would most likely break when trying to run inside the agent.
The directly imported packages will essentially pull in their required packages, and thus create a stable env on the remote machine. The agent will then store the entire env, as it assumes it will be able to fully replicate it the next time it runs.
If the "Installed Packages" section is empty...
Hi GiganticTurtle0
you should actually get " file://home/user/local_storage_path "
With "file://" prefix.
We always store the file:// prefix to note that this is a local path
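If you need the plain filesystem path back from that string, stripping the prefix is trivial (a small sketch, nothing clearml-specific):

```python
def strip_file_prefix(uri):
    """Remove a leading 'file://' so the string can be used as a local path."""
    prefix = "file://"
    return uri[len(prefix):] if uri.startswith(prefix) else uri

print(strip_file_prefix("file://home/user/local_storage_path"))
# home/user/local_storage_path
```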
This, however, requires that I slightly modify the clearml helm chart with the aws-autoscaler deployment, right?
Correct 🙂
Hi CourageousDove78
Not the cleanest, but you can basically pass everything here:
https://allegro.ai/clearml/docs/rst/references/clearml_api_ref/index.html#post--tasks.get_all
Reasoning is that it is passed almost as is to the server for the actual query.
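A sketch of what such a query could look like. The filter field names (`status`, `order_by`) are assumptions based on the API reference, and the clearml import is kept inside the function so the snippet stays importable without a configured server:

```python
def build_task_filter(statuses):
    """Filter dict in (roughly) the shape tasks.get_all accepts --
    field names are assumptions from the API reference."""
    return {
        "status": statuses,
        "order_by": ["-last_update"],
    }

def query_tasks(project_name, task_filter):
    # requires an installed + configured clearml setup to actually run
    from clearml import Task
    return Task.get_tasks(project_name=project_name, task_filter=task_filter)

f = build_task_filter(["completed", "published"])
print(f["status"])  # ['completed', 'published']
```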
Hi DashingHedgehong5
Is the text the labels on the histogram buckets?
Notice the xlabels
argument, is this what you are looking for?
I... did not, ashamed to admit.
UnevenDolphin73 😄 I actually think you are correct, meaning I "think" what you are asking is whether the low level logging (for example debug messages that are usually not printed to console) should also be logged? Is that correct?
I suppose one way to perform this is with a
that kicks
Yes, that was my thinking.
It seems more efficient to support a triggered response to task fail.
Not sure I follow this one, I mean the pipeline logic itself monitors the execution. If I'm not mistaken, a try/except will catch a step that fails, and a global try/except will catch the entire pipeline. Am I missing something?
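A minimal sketch of the two levels of error handling described (plain functions stand in for pipeline steps; all names are illustrative):

```python
def step_a():
    return 42

def step_b(x):
    raise RuntimeError("step failed")

def pipeline_logic():
    results = {}
    # per-step try/except: catches a single failing step
    try:
        results["a"] = step_a()
    except Exception:
        results["a"] = None
    try:
        results["b"] = step_b(results["a"])
    except Exception:
        results["b"] = None  # this step failed, but the pipeline continues
    return results

# global try/except: catches the entire pipeline
try:
    out = pipeline_logic()
except Exception:
    out = None

print(out)  # {'a': 42, 'b': None}
```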
Hi SkinnyPanda43
Can you attach the full log?
The clearml agent is installed before your requirements.txt, so at least in theory it should not collide
There may be cases where failure occurs before my code starts to run (and, perhaps, after it completes)
Yes that makes sense, especially from IT failure perspective
DepressedChimpanzee34
so parsing back is done via a yaml reader:
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/args.py#L506
We could add an extra test here, checking for \ in the string; that should solve it and will be backwards compatible (I think)
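The kind of check described might look like this (a pure-string sketch, not the actual clearml code):

```python
def escape_backslashes(value):
    """Double every backslash so a YAML reader keeps it literal
    (e.g. for Windows-style paths). Illustrative only."""
    if "\\" in value:
        return value.replace("\\", "\\\\")
    return value

print(escape_backslashes(r"C:\data\file.txt"))  # C:\\data\\file.txt
```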
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/task.py#L935
Hi @<1641611252780240896:profile|SilkyFlamingo57>
. It is not taking a new pull from Git repository.
When you say it's not getting the latest, are you referring to a new run of the pipeline where the component being pulled is not pulling the latest from the branch, is that the issue?
When you click on the component Task details (i.e. right hand side panel "Full details"), what's the commit ID you have?
Lastly, is the component running on the same machine as the prev...
cuda 10.1, I guess this is because no wheel exists for torch==1.3.1 and cuda 11.0
Correct
how can I enforce a specific wheel to be installed?
You mean like specific CUDA wheel ?
you can simply put the http link to the wheel in the "installed packages" section, it should work
3.a
Regarding the model query, sure from Python or restapi you can query based on any metadata
https://clear.ml/docs/latest/docs/references/sdk/model_model/#modelquery_modelsmodels
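A sketch of such a query from Python; per the SDK docs Model.query_models accepts project_name and tags (a "-" prefix excludes a tag). The import is kept inside the function so the sketch doesn't require a configured clearml setup:

```python
def find_models(project_name, tags):
    # requires an installed + configured clearml setup to actually run
    from clearml import Model
    return Model.query_models(project_name=project_name, tags=tags)

# Match models tagged "production" but not "archived"
tags = ["production", "-archived"]
print(tags)  # ['production', '-archived']
```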
3.b
If you are using clearml-serving then check the docs / readme, but in a nutshell yes you can.
If the inference code is batch processing, which means a Task, then of course you can launch it, check the clearml agent f...
Hi @<1600661423610925056:profile|StrongMouse81>
using the serving base url and also another model endpoint we added using:
clearml-serving model add
we get the attached response:
And other model endpoints are working for you?
Local changes are applied before installing requirements, right?
correct
Please tell me, does the agent always create a virtual environment?
Yes, but it inherits from the container's preinstalled system environment
is it possible to make the agent run the script in an already prepared docker container without creating a virtual environment in the container?
You can set the CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1
environment variable
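Put together, a minimal agent-side sketch (the daemon flags are illustrative):

```shell
# Skip creating a python virtual environment inside the container;
# the agent will use the container's preinstalled python instead
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1

# then start the agent in docker mode, e.g.:
# clearml-agent daemon --queue default --docker
```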