❯ git remote show
github
Unfortunately it still happens 😞 :
` Epoch 51: 100%|███████████████████████████████████████████████████████████| 361/361 [02:52<00:00, 2.10it/s, loss=0.169, v_num=9-29]
2021-09-17 09:58:22,253 - clearml.Task - INFO - Waiting for repository detection and full package requirement analysis
2021-09-17 10:03:22,254 - clearml.Task - INFO - Repository and package analysis timed out (300.0 sec), giving up
2021-09-17 10:03:22,313 - clearml.Task - WARNING - Failed auto-det...
` radu on vm-aimbrain-01 in experiments/runners/all via 🐍 v3.8.5 via C volt
❯ git ls-remote --get-url github
github
radu on vm-aimbrain-01 in experiments/runners/all via 🐍 v3.8.5 via C volt
❯ git ls-remote --get-url
fatal: No remote configured to list refs from.
radu on vm-aimbrain-01 in experiments/runners/all via 🐍 v3.8.5 via C volt
❯ git --version
git version 2.17.1 `
Sorry to ping you @<1523701087100473344:profile|SuccessfulKoala55> , can you offer any ideas on the two questions from my reply (about the correct web app cloud access and the correct way to specify a blob storage in the clearml.conf file)? Thanks 🙏
Must be something else foul at play here..
Do you think that simply adding origin will fix this? I really don't mind doing that.
I'll let you know asap
Hi AgitatedDove14 , I deleted everything in /opt/clearml as per the docs. Should I delete anything else?
OK I won't edit the db 😄 . Thanks for the suggestion, we'll use that!
I forgot to say I've set up a local server - we are still in the testing phase. I've created credentials for them because they couldn't generate them for themselves (they ran clearml-init and each have a clearml.conf file, but the ADD CREDENTIALS part didn't show up for them).
Unfortunately there doesn't seem to be any out-of-the-box functionality for ridgeline plots (joyplots) in plotly. They are certainly doable ( https://www.python-graph-gallery.com/ridgeline-graph-plotly , or https://chart-studio.plotly.com/~empet/14632/plotly-joyplotridgelines/#/ ) but I'd guess this won't happen any time soon 🤭 . We'd be happy with also having functionality similar to the one from the Scalars tab: first isolating one iteration (the latest by default) and grouping togeth...
Hi @<1523701087100473344:profile|SuccessfulKoala55> ,
thanks for the pointers.
I didn't know that the plot data is stored in elasticsearch. Good to know. It relates to the rest of my questions in that I want to understand where everything is saved, all the parts of my experiments. The plots are actually the most important part, since I have direct access to the artifacts I save (like, say, models) but not to the plot data which helps me compare and rank experiments. I mention tensorboard be...
This is what the links to the artifacts look like (the part I blurred out is the last part of the secret, which is working fine since the task was able to upload those correctly to storage, I can check that):
` # Development mode worker
worker {
# Status report period in seconds
report_period_sec: 2
# ping to the server - check connectivity
ping_period_sec: 30
# Log all stdout & stderr
log_stdout: true
# Carriage return (\r) support. If zero (0) \r treated as \n and flushed to backend
# Carriage return flush support in seconds, flush consecutive line feeds (\r) every X (default: 10) s...
radu on vm-aimbrain-01 in experiments/runners/all via 🐍 v3.8.5 via C volt
❯ grep flush ~/clearml.conf
# Carriage return (\r) support. If zero (0) \r treated as \n and flushed to backend
# Carriage return flush support in seconds, flush consecutive line feeds (\r) every X (default: 10) seconds
console_cr_flush_period: 600
That works fine:
1631895370729 vm-aimbrain-01 info ClearML Task: created new task id=cfed3ea8512d4d9f858d085bd79e62e8
2021-09-17 16:16:10,744 - clearml.Task - INFO - No repository found, storing script code instead
ClearML results page:
`
1631895370892 vm-aimbrain-01 info start
1631895370896 vm-aimbrain-01 error 0%| | 0/100 [00:00<?, ?it/s]
1631895471026 vm-aimbrain-01 error 100%|████...
I don't control tqdm (otherwise I would have already gone for Stef's suggestion); pytorch-lightning does in this particular script 😞 .
I found out that the lightning trainer has a progress_bar_refresh_rate argument (default set to 1) which produces the spamming logs. If I set that to 10, I get 1/10th of the spam (but a janky progress bar in the console). I could set it to 0 to disable it, but that's not really a fix. What I'd really want is the same behaviour in the console (one smooth progress bar) and one line per epoch in the logs; high hopes, right? 😊
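To make the "one line per epoch in the logs" idea concrete, here is a rough sketch of the kind of filtering I have in mind. It is not what ClearML or Lightning actually do; `collapse_carriage_returns` is a hypothetical helper name, and it assumes the progress bar redraws the whole line after every `\r`.

```python
def collapse_carriage_returns(text: str) -> str:
    """Keep only the final state of each line, roughly as a terminal shows it.

    Hypothetical helper: a progress bar redraws itself by emitting a
    carriage return followed by the new bar, so everything before the
    last carriage return on a line is stale and can be dropped before
    the text is sent to a log.
    """
    collapsed = []
    for line in text.split("\n"):
        # A trailing carriage return (e.g. from CRLF endings) has no content.
        line = line.rstrip("\r")
        # Whatever follows the last remaining carriage return stays visible.
        collapsed.append(line.rsplit("\r", 1)[-1])
    return "\n".join(collapsed)


raw = "Epoch 0:   0%\rEpoch 0:  50%\rEpoch 0: 100%\nEpoch 1:   0%\rEpoch 1: 100%\n"
print(collapse_carriage_returns(raw))
```

The console would still receive the raw stream (and so keep its smooth bar), while only the collapsed text would go to the log.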
Also I just tried the pytorch-lightning RichProgressBar (not yet released) instead of the default (which is unfortunately based on tqdm) and it works great.
Interesting, I don't get newlines in any of my consoles:
ClearML Task: overwriting (reusing) task id=38cc10401fcc43cfa432b7ceed7df0cc
2021-10-08 14:57:53,704 - clearml.Task - INFO - No repository found, storing script code instead
ClearML results page:
`
...
The UI shows the log as is (and as pasted above). In the console I'm getting correct output (a single tqdm progress line):
` [2021-09-17 13:29:51,860][pytorch_lightning.utilities.distributed][INFO] - GPU available: True, used: True
[2021-09-17 13:29:51,862][pytorch_lightning.utilities.distributed][INFO] - TPU available: False, using: 0 TPU cores
[2021-09-17 13:29:51,862][pytorch_lightning.utilities.distributed][INFO] - IPU available: False, using: 0 IPUs
[2021-09-17 13:29:51,866][pytorch_ligh...
Hi Jake, thanks for the reply. I've tried the account key method and it works fine, but unfortunately clearml expects an old version of azure-storage-blob (<2.1), which is incompatible with the recent versions (^12.). Any idea how we could work around this one? Thanks again.
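For anyone wanting to check which side of that constraint an environment is on, here is a small sketch. It does not solve the conflict, it only reports it. The helper names are hypothetical, and the `<2.1` bound is simply the one quoted above.

```python
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn '12.8.1' into (12, 8, 1), ignoring suffixes such as 'b1'."""
    parts = []
    for chunk in version.split("."):
        digits = ""
        for ch in chunk:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_legacy_blob_sdk(version: str) -> bool:
    """True if the version satisfies the '<2.1' constraint quoted above."""
    return parse_version(version) < (2, 1)


def installed_blob_sdk_version():
    """Return the installed azure-storage-blob version, or None if absent."""
    try:
        return metadata.version("azure-storage-blob")
    except metadata.PackageNotFoundError:
        return None


found = installed_blob_sdk_version()
if found is None:
    print("azure-storage-blob is not installed")
elif is_legacy_blob_sdk(found):
    print(found, "satisfies <2.1")
else:
    print(found, "is newer than the <2.1 constraint")
```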
This is adapted from one of the methods in their ProgressBar class:
import sys
import time

from tqdm import tqdm

# Same arguments as in their progress bar; position=1 is the interesting one
bar = tqdm(
    desc="Training",
    initial=1,
    position=1,
    disable=False,
    leave=False,
    dynamic_ncols=True,
    file=sys.stderr,
    smoothing=0,
)
with bar:
    for i in range(10):
        time.sleep(0.1)
        bar.update()
print('done')
In the console this works as expected, but in a jupyter notebook this produces a scrolling log (because of the position=1 argument, which happens whenever the bar is not th...
In case anyone is interested, the minimum effort workaround I found is to edit pytorch_lightning/callbacks/progress.py and change all occurrences of dynamic_ncols=True to dynamic_ncols=False in the calls to tqdm. One could of course implement a custom callback inheriting from their ProgressBar class.
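If you'd rather script the edit than do it by hand, something along these lines should do. It is only a sketch: `disable_dynamic_ncols` is a hypothetical helper that works on the file's text, and it assumes the argument is always written as `dynamic_ncols=True` (possibly with spaces around the `=`).

```python
import re

# Matches the keyword argument regardless of spacing around '='.
_PATTERN = re.compile(r"dynamic_ncols\s*=\s*True")


def disable_dynamic_ncols(source: str):
    """Return (patched_source, number_of_replacements)."""
    return _PATTERN.subn("dynamic_ncols=False", source)


sample = "bar = tqdm(desc='Training', dynamic_ncols=True, file=sys.stderr)\n"
patched, count = disable_dynamic_ncols(sample)
print(count, patched, end="")
```

To apply it, read the installed progress.py into a string, pass it through the function, and write the result back after keeping a backup copy. The edit is lost whenever the package is reinstalled or upgraded.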