Just making sure, the original code was executed on Python 3?
BTW:
This is very odd: "~/.clearml/venvs-builds.3/3.6/bin/python" thinks it is using Python 3.6, but it is linked with Python 2.7 ...
No idea how that could happen
HugeArcticwolf77 actually it is more than that, you can embed the graphs in the markdown now: when you hover over any plot/graph/image there is a new button that copies the embed text, so you can paste it directly into your markdown editor (internal or external)
More documentation and screenshots are coming after the holiday; in the meantime you can check:
https://clear.ml/docs/latest/docs/webapp/webapp_reports
https://clear.ml/docs/latest/assets/images/webapp_report-695dddd2ec8064938bf8...
BTW: do note to install the agent into the system Python packages and not into any venv.
I have one agent running on the machine. I also have only one task running. This only happens to us when we use pipelines
@<1724960468822396928:profile|CumbersomeSealion22> notice that when you are launching a pipeline you are actually running Two tasks, one is the "pipeline" itself (i.e. the logic) and one is the component in the pipeline (i.e. the step)
If you have one agent, I'm assuming what happens is the pipeline itself (the one that you launch on your machine)...
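If that's the case, a common workaround (a sketch assuming the PipelineController API; the queue name here is an assumption) is to send the pipeline logic to its own queue so the worker agent stays free for the component tasks:

```python
def start_controller(pipe, controller_queue="services"):
    # The pipeline logic is its own task: enqueue it separately from the
    # component tasks so a single agent is not occupied by the controller.
    # `pipe` is expected to be a clearml PipelineController instance.
    pipe.start(queue=controller_queue)
```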
Hi LazyFish41
Could it be some permission issue on /home/quetalasj/.clearml/cache/ ?
For me it sounds like the starting of the service completed, but I don't really see whether the autoscaler is actually running. Also I don't see any output in the console of the autoscaler.
Do notice the autoscaler code itself needs to run somewhere, by default it will be running on your machine, or on a remote agent,
So sorry for the delay here! totally lost track of this thread
Should I be deploying the entire docker file for every model, with the updated requirements?
It's for every "environment", i.e. if models need the same set of python packages, you can share
CharmingStarfish14 can you check something from code, just to see if this would solve the issue?
Hi @<1556812486840160256:profile|SuccessfulRaven86>
I'm assuming this relates to the SaaS service.
API calls are a way to measure usage; basically metric reports are bunched into a single call, agent pings / queries are API calls, and so on and so forth.
How many hours did you have training tasks reporting data? How many agents running, and so on?
MelancholyChicken65 found it! Thank you for finding this issue.
I'm hoping to get an update soon 🙂
i had a misconception that the conf comes from the machine triggering the pipeline
Sorry, this one :)
How do you currently report images, with the Logger or Tensorboard or Matplotlib ?
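For reference, reporting an image through the ClearML Logger looks roughly like this (a sketch; the title/series names are made up, and `report_image` also accepts `local_path=` for a file on disk):

```python
def log_debug_image(task, image_array, iteration=0):
    # Fetch the task's logger and report an image array as a debug sample.
    logger = task.get_logger()
    logger.report_image(
        title="debug",      # assumed title, pick your own
        series="sample",
        iteration=iteration,
        image=image_array,  # e.g. a numpy HxWx3 array
    )
```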
Guys I think I lost context here 🙂 what are we talking about? Can I help in anyway ?
okay, let me check it, but I suspect the issue is running over SSH; to overcome these issues with PyCharm we have a specific plugin to pass the git info to the remote machine. Let me check what we can do here.
FiercePenguin76 BTW, you can do the following to add / update packages on the remote session:
clearml-session --packages "newpackge>x.y" "jupyterlab>6"
EnviousStarfish54
Can you check with the latest clearml from github?
pip install git+
If the only issue is this line:
task.execute_remotely(..., exit_process=True)
It has to finish the static analysis of the entire repository (which usually happens in the background, but now we have to wait for it). If the repo is large this could actually take 20 sec (depending on the CPU/drive of the machine itself)
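A minimal sketch of the call in context (the project/task names and queue are assumptions; clearml must be installed and configured):

```python
def launch_remotely(queue_name="default"):
    # Import inside the function so this module loads even without clearml.
    from clearml import Task

    task = Task.init(project_name="examples", task_name="remote run")
    # execute_remotely() waits for the repository's static analysis to
    # finish before exiting the local process -- on a large repo this is
    # where the extra ~20 sec goes.
    task.execute_remotely(queue_name=queue_name, exit_process=True)
```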
Hmm, I would recommend passing it as an artifact, or returning its value from the decorated pipeline function. Wdyt?
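Passing it as an artifact would look roughly like this (a sketch; the artifact name "result" is an assumption):

```python
def share_result(task, result):
    # Store the object as a task artifact so downstream steps can fetch it
    # via task.artifacts["result"].get() instead of relying on local paths.
    task.upload_artifact(name="result", artifact_object=result)
```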
Wait @<1523701066867150848:profile|JitteryCoyote63>
If you reset the Task you would have lost the artifacts anyhow, how is that different?
clearml sdk (i.e. python client)
The issue is that Task.create did not add the repo link (again, as mentioned above, you need to pass the local folder or repo link to the repo argument of the Task.create function). I "think" it could automatically deduce the repo from the script entry point, but I'm not sure, hence my question about the clearml package version
Oh I see, what you need is to pass '--script script.py' as the entry point and '--cwd folder' as the working dir
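In code, the equivalent Task.create call would be along these lines (a sketch; the project/task names are placeholders):

```python
def create_repo_task(repo_url, script="script.py", cwd="folder"):
    # Imported here so this sketch loads even without clearml installed.
    from clearml import Task

    # Task.create does not auto-detect the repository the way Task.init
    # does, so the repo link (or local folder), entry point and working
    # dir are passed explicitly.
    return Task.create(
        project_name="examples",
        task_name="from repo",
        repo=repo_url,
        script=script,
        working_directory=cwd,
    )
```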
Ohh, then use the AWS autoscaler; it's basically what you want: it spins up an EC2 instance and sets an agent there, then if the EC2 goes down (for example if it is a spot instance), it will spin it up again automatically with the running Task on it.
wdyt?
The imports inside the functions are because the function itself becomes a stand-alone job running on a remote machine, not the entire pipeline code. This also automatically picks packages to be installed on the remote machine. Make sense?
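As an illustration of the pattern (a plain sketch; with clearml you would decorate such a function with `PipelineDecorator.component`):

```python
def count_rows(json_rows):
    # The import lives inside the function body: when this function becomes
    # a stand-alone job, ClearML scans these imports to decide which
    # packages to install on the remote machine.
    import json
    return len(json.loads(json_rows))
```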
What do you mean by "tag" / "sub-tags"?
SmarmySeaurchin8 check the logs, maybe you can find something there
Hi @<1523702000586330112:profile|FierceHamster54>
I think I'm missing a few details on what is logged, and ref to the git repo?
Now I'm curious what's the workaround ?