
ah, my mistake, that’s an issue in my conf file.
but there was a `pip_version: "<20.2"` line in my `clearml.conf`, which might have been a default in the config file 2 years ago or so
the issue also may have been fixed somewhere between 20.1 and 22.2, i didn’t test versions in between those two
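to spell out why that line mattered: an upper bound like "<20.2" rules out any newer pip, including 22.2. below is a rough sketch of that comparison in plain Python. it assumes simple dotted numeric versions, and `parse_version` / `satisfies_upper_bound` are made-up helper names, not anything from clearml or pip (real tools apply fuller version rules).

```python
def parse_version(text):
    """Turn a dotted numeric version like '20.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_upper_bound(installed, bound):
    """Return True if `installed` is strictly below `bound` (the '<' case only)."""
    return parse_version(installed) < parse_version(bound)


# Hypothetical check mirroring the "<20.2" constraint from the config line.
for candidate in ("20.1", "22.2"):
    print(candidate, satisfies_upper_bound(candidate, "20.2"))
```

this only models the `<` operator; pre-releases and trailing zeros ("20.2" vs "20.2.0") are not handled.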
(also, the training code, which uses pandas, worked)
okay, looks like my main issue was the errant `plt.show` ( 😩 ). `report_media` works fine without specifying `set_default_upload_destination` when that’s been removed 😅
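for context, one common way to end up with blank images in matplotlib is saving a figure after `plt.show()` has already returned. a minimal sketch of the safer order, assuming a headless `Agg` backend: render to an in-memory PNG from the figure object, then close it. `render_png` is a hypothetical helper name, and the `report_media` call is left as a comment because it needs a configured clearml server.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, no window is opened
import matplotlib.pyplot as plt


def render_png(values):
    """Draw a simple line plot and return it as PNG bytes."""
    fig, ax = plt.subplots()
    ax.plot(values)
    buf = io.BytesIO()
    # Save from the figure object itself, with no plt.show() in between.
    fig.savefig(buf, format="png")
    plt.close(fig)  # free the figure once it has been written
    return buf.getvalue()


png_bytes = render_png([1, 3, 2, 4])

# Assumed usage with clearml (not executed here):
# from clearml import Logger
# Logger.current_logger().report_media(
#     title="plots", series="example", iteration=0,
#     stream=io.BytesIO(png_bytes), file_extension=".png",
# )
```

checking that `png_bytes` is non-empty before reporting is a cheap way to catch a 0-byte upload early.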
great, thank you very much for this info!
$ pip freeze | grep pandas
geopandas @ file:///home/conda/feedstock_root/build_artifacts/geopandas_1623249625470/work
pandas==1.3.3
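the same check can be done from inside the interpreter that actually runs the task, which helps when the shell and the task use different environments. a small sketch using the standard library's `importlib.metadata`; `installed_versions` is a made-up helper name.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from the pip output above.
print(installed_versions(["pandas", "geopandas"]))
```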
don’t want to pester, but i am curious—did they have some thoughts on what was happening? should i make a feature request somewhere?
$ conda list | grep matplotlib
matplotlib         3.4.3    py39hf3d152e_1    conda-forge
matplotlib-base    3.4.3    py39h2fa2bec_1    conda-forge
wondering if there has been an update on this?
should be posted in the “uncommitted changes” section 🙂
actually yes, `task.init` is called inside of a class in one of the internal imports
running my own clearml server with a vanilla config (obtained from github), except i have one fixed user
and it’s in the “installed packages” from the child task:
absl-py==0.14.0
aiohttp==3.7.4.post0
async-timeout==3.0.1
attrs==21.2.0
cachetools==4.2.2
certifi==2021.5.30
chardet==4.0.0
charset-normalizer==2.0.6
clearml==1.1.1
cycler==0.10.0
Cython==0.29.24
fsspec==2021.9.0
furl==2.1.2
future==0.18.2
google-auth==1.35.0
google-auth-oauthlib==0.4.6
grpcio==1.40.0
idna==3.2
joblib==1.0.1
jsonschema==3.2.0
kiwisolver==1.3.2
Markdown==3.3.4
matplotlib==3.4.3
multidict==5.1.0
numpy==1.21.2
oauthlib=...
2023-05-06 12:05:49,168 - clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###
lol.
this changes the status in the UI to “aborted”.
not ideal, but if the answer is “for this to work, tasks must be run by an agent” i accept it
thanks for that tip. i cleared out the vcs cache and was already using the latest version of the agent, same problem persists.
there’s a python version mismatch, i will make a different env for the agent to run in that has a matching python version
that must have been it. here are the installed packages when not using `-m`:
# Python 3.9.7 | packaged by conda-forge | (default, Sep 23 2021, 07:28:37) [GCC 9.4.0]
Local modules found - skipping:
modulename == ../pathto/modulename/__init__.py
PyYAML == 5.4.1
Shapely == 1.7.1
clearml == 1.1.1
click == 7.1.2
matplotlib == 3.4.3
numpy == 1.21.2
pandas == 1.3.3
python_dateutil == 2.8.2
pytorch_lightning == 1.4.8
pytz == 2021.1
rasterio == 1.2.8
scikit_image == 0.18.3
scikit_learn == 0...
getting different issues (torchvision vs. cuda compatibility, will work on that), but i’m betting that was the issue
okay, i have a few things on my todo list, they will take a while. we will `task.init` in the entry point instead of how it’s done now, and we will re-try `python -m`. if it doesn’t work, we will file an issue. if it does work, yay!
either way, thanks much for your help today, i really appreciate it.
but hmm, `report_media` generates a file that is 0 bytes, whereas `report_image` generates a 33KB file
in the main script, these are the first imports:
import argparse
import time
import json
import pytorch_lightning as pl
from pytorch_lightning.accelerators import accelerator
then after that we import stuff from the repo, and the listed packages are imported in those files
correct, i’m just running the task via CLI
we do use all those packages, and the version numbers are correct
good luck! thanks for looking into it 🙂
if you’re able to check the data store, folders for all 120 plots will be on disk.
okay, so if i set `set_default_upload_destination` to a URI that’s local to the computer running the task (and the server):
- the server is “unable to load the image”, which is not surprising because the filesystem URI was not mounted into the container
- the files are present at the expected location on the local filesystem, but they are… blank! all white.
that tells me that `report_media` might have been successful, but there’s some issue… encoding the data to a jpeg?
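a quick way to tell a 0-byte upload apart from a file that was at least written as a JPEG is to look at its size and first bytes. a minimal stdlib sketch; `describe_image_file` is a hypothetical helper, not part of clearml.

```python
import os

JPEG_MAGIC = b"\xff\xd8\xff"  # the bytes every JPEG file starts with


def describe_image_file(path):
    """Return 'missing', 'empty', 'not-jpeg', or 'jpeg' for the file at `path`."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as handle:
        header = handle.read(len(JPEG_MAGIC))
    return "jpeg" if header == JPEG_MAGIC else "not-jpeg"
```

note that an all-white image still comes back as "jpeg" here: this only confirms the file was encoded, not that the figure had any content when it was saved.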