I'm just trying to see what the default server is, and whether it is responsive
I'm assuming you mean your own server, not the demo server, is that correct?
and then second part is to check if it is up and alive
Yes, you can curl the ping endpoint:
https://clear.ml/docs/latest/docs/references/api/debug#post-debugping
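If you would rather do the check from Python than with curl, something along these lines should work. This is only a minimal sketch: it assumes the ping endpoint is exposed as `/debug.ping` on your API server and answers without credentials, and the `http://localhost:8008` address, `ping_url` and `server_is_alive` are illustrative names, not part of the ClearML SDK.

```python
import urllib.error
import urllib.request


def ping_url(api_server: str) -> str:
    """Build the ping URL from the API server base address."""
    return api_server.rstrip("/") + "/debug.ping"


def server_is_alive(api_server: str, timeout: float = 5.0,
                    urlopen=urllib.request.urlopen) -> bool:
    """Return True if the server answers the ping with HTTP 200.

    `urlopen` is injectable only so the check can be exercised offline.
    """
    request = urllib.request.Request(
        ping_url(api_server),
        data=b"{}",  # empty JSON body for the POST
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout or HTTP error status
        return False


if __name__ == "__main__":
    # Assumed address of a self-hosted API server; replace with your own
    print(server_is_alive("http://localhost:8008"))
```

The first part of the question (which server is set) is answered by whatever `api_server` value your configuration holds; pass that same value in here.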
HealthyStarfish45 could you take a look at the code, see if it makes sense to you?
What I'm getting to, is maybe we build a template, then you could fill in the gaps ?
In the docker bash startup script:
apt-get install poppler-utils
Exactly, that's my problem: I want to remove it to make sure it is reinstalled (because the version can change)
JitteryCoyote63 yes, this is definitely a pip bug... can you test with the latest pip version, maybe it was fixed? (i.e. git+https:// link)
BTW
/home/local/user/.clearml/venvs-builds/3.7/bin/python: can't open file 'train.py': [Errno 2] No such file or directory
This error is from the agent, correct? It seems it did not clone the correct code. Is train.py committed to the repository?
I am creating this user
Please explain, I think this is the culprit ...
I have to admit, I haven't had the time
Trying to get pip to be twice as fast
https://github.com/pypa/pip/pull/8215
Please keep pinging me, I would really like to follow on it.
and when you remove the "." line does it work?
And is this repo installed on the pipeline creating machine ?
Basically I'm asking how come it did not automatically detect it?
Hi GrievingTurkey78 ,
Yes this is a per file download, but I think you can list the bucket and download everything
Try:
from trains import StorageManager
from trains.storage.helper import StorageHelper
helper = StorageHelper.get('gs://bucket/folder')
remote_files = helper.list('*')
for f in remote_files:
    StorageManager.get_local_copy(f)
they are just neighboring modules to the function I am importing.
So I think that if you specify the repo, on the remote machine you will end up with the code of the component sitting at the root folder of the repo. From there I assume you can import the rest; the root git path should be part of your PYTHONPATH automatically.
wdyt?
is removed from the experiment list?
You mean archived ?
do you have docker installed on all slurm agent/worker machines
Docker support?
Okay ConfusedPig65 I found the problem. For some reason the latest TF.keras load_model / save_model is not tracked.
I'll make sure we push a fix later today
PompousBeetle71 quick question, will you ever want to pass an empty string? The reason for asking is that it is either one or the other, there is no way for Trains to actually differentiate (from the web UI perspective, this is just an empty string field...)
TenseOstrich47 every agent instance has its own venv copy. Obviously every new experiment will remove the old venv and create a new one. Make sense?
Yes, the same will work with artifacts, just pass the full URL as the artifact_object
it should just register it as is.
I mean you can manually get the results and rescale, but not through the UI
GiganticTurtle0 in the PipelineDecorator.component, did you pass helper_functions=[] with reference to all the sub components?
This one seem to work
from clearml import Task
task = Task.init(...)
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data:
np.random.seed(10)
D = np.random.normal((3, 5, 4), (0.75, 1.00, 0.75), (200, 3))
# plot:
fig, ax = plt.subplots()
vp = ax.violinplot(D, [2, 4, 6], widths=2,
                   showmeans=False, showmedians=False, showextrema=False)
# styling:
for body in vp['bodies']:
    body.set_alpha(0.9)
ax.set(xlim=(0, 8), xticks=np.arang...
Good point!
I'll make sure we do
make sure you follow all the steps :
https://clear.ml/docs/latest/docs/deploying_clearml/upgrade_server_linux_mac
Basically, make sure you get the latest docker-compose.yml and then pull it:
curl -o /opt/clearml/docker-compose.yml
docker-compose -f /opt/clearml/docker-compose.yml pull
docker-compose -f /opt/clearml/docker-compose.yml up -d
can we also put the path to the CA?
Yes :)
I know that there is possibility to set up some budget - for example seconds of running after which optimization stops. But is there a possibility to specify a boolean condition when work should stop?
RoundMosquito25 you mean when you reach a limit of loss < threshold, or something similar?
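To make the "stop on a boolean condition" idea concrete, here is a rough, library-agnostic sketch. It does not use ClearML's optimizer API; `run_until`, `LOSS_THRESHOLD` and the trial data are made up purely for illustration. It only shows the logic: consume results as they arrive and stop as soon as the condition holds.

```python
from typing import Callable, Iterable, List, Tuple


def run_until(trials: Iterable[Tuple[str, float]],
              should_stop: Callable[[float], bool]) -> List[Tuple[str, float]]:
    """Consume (trial_id, loss) results in order; stop once should_stop(loss) is True."""
    completed = []
    for trial_id, loss in trials:
        completed.append((trial_id, loss))
        if should_stop(loss):
            break  # condition met, do not look at any further trials
    return completed


LOSS_THRESHOLD = 0.1  # hypothetical target value

results = run_until(
    [("trial-1", 0.42), ("trial-2", 0.09), ("trial-3", 0.30)],
    should_stop=lambda loss: loss < LOSS_THRESHOLD,
)
print(results)
```

In a real setup the list of trials would be replaced by whatever polling loop reports the objective of finished experiments, and the `break` by the call that stops the optimization.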
I assume the task is being launched sequentially. I'm going to prepare a more elaborate example to see what happens.
Let me know if you can produce a mock test, I would love to make sure we support the use case, this is a great example of using pipeline logic
How do you run the clearml-agent in docker mode?
clearml-agent --docker
See here:
https://clear.ml/docs/latest/docs/clearml_agent#docker-mode
Shouldn't this be a real value and not a template
you mean value being pulled to the pod that failed ?
"Requested version: 2.28, Used version 1.0" for some reason
This is fine that means there is no change in that API
Hmm let me check something