I had no idea it was going to do that and sent your servers over 1.4M API hits unintentionally
Yeah, that is way too much. I think it relates to how frequently it updates the console :)
Hi OutrageousSheep60
What do you mean by "in clearml server"? I do not see any reason a subprocess call from a Task would be an issue. What am I missing?
I think I found something related to the issue of the subprocess not logging. Let me check if we can share something quickly
SteepDeer88
Try the following:
` Task.add_requirements("pycocotools-windows", '; platform_system == "Windows"')
Task.add_requirements("pycocotools", '; platform_system != "Windows"')
Task.init(...) `
You should see in your "installed packages" something like:
` pycocotools-windows ; platform_system == "Windows"
pycocotools ; platform_system != "Windows" `
agree, but setting the agent's env variable TMPDIR
I think this needs to be passed to the docker with -e TMPDIR=/new/tmp
as additional container args:
see example
wdyt?
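A rough sketch of what passing TMPDIR into the container could look like from the Task side. The helper names (`docker_env_args`, `apply_to_task`) and the image name are made up for illustration, and I'm assuming `Task.set_base_docker(docker_image=..., docker_arguments=...)` is the right hook for your clearml version, so double check before relying on it:

```python
import shlex


def docker_env_args(env):
    """Build `-e NAME=value` docker arguments from a dict of environment variables."""
    parts = []
    for name, value in env.items():
        parts.extend(["-e", f"{name}={value}"])
    return parts


def apply_to_task(task, env, image="python:3.10"):
    # Hypothetical helper: `task` is expected to be a clearml Task and `image`
    # is a placeholder. Assumes set_base_docker accepts the extra container
    # arguments as a single string.
    task.set_base_docker(
        docker_image=image,
        docker_arguments=shlex.join(docker_env_args(env)),
    )


args = docker_env_args({"TMPDIR": "/new/tmp"})
print(shlex.join(args))  # -e TMPDIR=/new/tmp
```

Only the argument building runs here; `apply_to_task` is left uncalled because it needs a real Task object.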
Found the issue, fix in the next RC (soon to be out)
Hi VivaciousWalrus21
After restarting training huge gaps appear in iteration axis (see the screenshot).
The Task.init call actually tries to work out what the last reported iteration was and continues from that iteration. I'm assuming your code does that as well, which creates the "double shift" that you see as the jump. I think the next version will try to be "smarter" about it and detect this double gap.
In the meantime, you can do:
` task = Task.init(...)...
Expected behaviour is that it reads last iteration correctly. At least it is stated in docs so.
This is exactly what should happen, are you saying that for some reason it fails?
Hi VivaciousWalrus21 I tested the sample code, and the gap was evident in Tensorboard as well. This jump is not generated by clearml; it is internal (like the auto de/serialization and continue of the code base)
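To make the "double shift" idea from this thread concrete, here is a toy sketch in plain Python. It only models the arithmetic (an offset added on top of whatever the code reports); `reported_iteration` and the numbers are invented for illustration and this is not how clearml implements it:

```python
def reported_iteration(code_iteration, logger_offset):
    """Value shown on the iteration axis: what the code reports plus the logger's offset."""
    return logger_offset + code_iteration


last_iteration = 1000  # last iteration reported before the restart

# Case 1: only the logger continues from the last iteration,
# the training code restarts its own counter from 1.
single = reported_iteration(code_iteration=1, logger_offset=last_iteration)

# Case 2: the training code also resumes its counter from the checkpoint,
# so the same shift is applied twice and a gap appears on the axis.
double = reported_iteration(code_iteration=last_iteration + 1, logger_offset=last_iteration)

print(single)  # 1001
print(double)  # 2001
```

In the second case the axis jumps from 1000 straight to 2001, which is the kind of gap described above.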
I'm assuming these are the only packages that are imported directly (i.e. pandas requires other packages, but the code imports pandas, so this is what is listed).
The way ClearML detects packages: it first tries to understand whether this is a "standalone" script. If it is, then only imports in the main script are logged. If it "thinks" this is not a standalone script, it will analyze the entire repository.
Make sense?
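If it helps, the "only directly imported packages are listed" part can be sketched with the standard `ast` module. This is just an illustration of the idea, not the actual detection code, and `imported_packages` is a made-up name:

```python
import ast


def imported_packages(source):
    """Return the sorted root package names that `source` imports directly."""
    packages = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import (from . import x), i.e. local code
            if node.level == 0 and node.module:
                packages.add(node.module.split(".")[0])
    return sorted(packages)


script = """
import pandas as pd
from sklearn.model_selection import train_test_split
from . import helpers
"""
print(imported_packages(script))  # ['pandas', 'sklearn']
```

Note that numpy never shows up here even though pandas depends on it, because the script itself does not import it.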
WackyRabbit7 I might be missing something here, but the pipeline itself should be launched on the "pipelines" queue. Is the pipeline itself running, or is it the step itself that is stuck in the "queued" state?
- Components anyway need to be available when you define the pipeline controller/decorator, i.e. same codebase
No, you can specify a different code base, see here:
- The component code still needs to be self-composed (or, function component can also be quite complex)
Well, it can address the additional repo (it will be automatically added to the PYTHONPATH), and you c...
I am writing quite a bit of documentation on the topic of pipelines. I am happy to share the article here, once my questions are answered and we can make a pull request for the official documentation out of it.
Amazing please share once done, I will make sure we merge it into the docs!
Does this mean that within component or add_function_step I cannot use any code of my current directories code base, only code from external packages that are imported - unless I add my code with ...
Hi ScantChimpanzee51
How are you launching the code ?
Basically the easiest way is to do so with the example you just mentioned,
Can this issue be reproduced ?
BTW: the cloning error is actually the wrong branch, if you take a look at your initial screenshot, you can see the line before last branch='default'
which I assume should be branch='master'
(The error itself is still weird, but I assume that this is what git is returning)
Hurray :)
BTW: the next version will have a project level "readme alike" markdown embedded in the UI, so hopefully you will be able to add all the graphs there :)
Anyhow if the StorageManager.upload was fast, the upload_artifact is calling that exact function. So I don't think we actually have an issue here. What do you think?
For example:
agent.docker_preprocess_bash_script = [
    "echo 'Binary::apt::APT::Keep-Downloaded-Packages \"true\";' > /etc/apt/apt.conf.d/docker-clean",
    "apt-get update",
    "apt-get install -y wget",
    "echo \"we have wget\"",
]
Yes EnviousStarfish54, the comparison is line by line, and each experiment is compared only to the left experiment (like any multi comparison, you have to set the baseline, which is always the left column here; note that you can reorder the columns and the comparison will be updated)
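A small sketch of what "line by line against the left column" means in practice. It is plain Python with invented experiment names and values, not the UI's actual comparison code:

```python
from itertools import zip_longest


def diff_against_left(columns):
    """columns: list of (name, lines); the first entry is the baseline."""
    _, baseline = columns[0]
    report = {}
    for name, lines in columns[1:]:
        # Collect the indices of lines that differ from the baseline column
        report[name] = [
            index
            for index, (left, right) in enumerate(zip_longest(baseline, lines))
            if left != right
        ]
    return report


columns = [
    ("exp_a", ["lr=0.1", "batch=32", "epochs=10"]),
    ("exp_b", ["lr=0.1", "batch=64", "epochs=10"]),
    ("exp_c", ["lr=0.01", "batch=64", "epochs=10"]),
]
print(diff_against_left(columns))  # {'exp_b': [1], 'exp_c': [0, 1]}

# Reordering the columns changes the baseline, and with it the result
reordered = [columns[1], columns[0], columns[2]]
print(diff_against_left(reordered))  # {'exp_a': [1], 'exp_c': [0]}
```

Here exp_c differs from exp_a on two lines but from exp_b on only one, so what gets highlighted depends on which experiment sits in the left column.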
DS, this way they only need to remember (and me only need to teach them where to find) one id.
Yes, that's the point: this ID is the Model UID (as opposed to the Task ID). The reason I kind of "insist" on it is that the Model ID is built into the system, meaning this is how you register it, as opposed to the Task ID, which somehow needs to be hacked/passed externally
TBH the main reason I went with our API is that because of the custom model loading, we need to use the "custom" framew...
HealthyStarfish45
No, it should work :)
Seems like credentials error
Do you have everything set up correctly in your ~/clearml.conf?
No worries :) glad to hear it worked out :)
PompousBeetle71 just making sure, and changing the name solved it?
Hi DepressedChimpanzee34, took me a while but I think there is a solution:
In your docker file, replace:
https://github.com/allegroai/clearml-server/blob/a64c4d264d00eadd2d11818b37151d3cc6266d99/docker/docker-compose.yml#L5
with
entrypoint: /bin/bash
command: -c "mkdir -p /var/log/clearml && cd /opt/clearml/ && python3 -m apiserver.apierrors_generator && gunicorn -w 4 -t 600 --bind=0.0.0.0:8008 apiserver.server:app"
EnviousStarfish54
it seems that if I don't use plt.show() it won't show up in Allegro, is this a must?
Yes, at plt.show / plt.savefig Trains will capture the plot and send it to the backend.
BTW: when you hover over the empty plot area, do you see the plotly objects, or is it all blank ?
What's the error you are getting ?
CheerfulGorilla72 could it be the server address has changed when migrating ?