
Seems like the server returned a 400 error, verify that you are working with your trains-server and not the demo server :)
The point is, "leap" is properly installed, this is the main issue. And although installed, it is missing the ".so"? What am I missing? What are you doing manually that does not show in the log?
In other words, how did you install it "manually" inside the docker, when you mentioned it worked for you when running without the agent?
I am symlinking the .clearml directory to a NAS server and this is perhaps part of the problem.
Yep, that sounds about right, it uses Posix file system for internal lock mechanisms (multi process locks), and my guess is that the NAS for some reason does not support it...
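A quick way to test that guess is to try taking a lock on a file inside the symlinked directory. This is only a minimal sketch: it assumes a Unix host, the helper name `supports_posix_locks` is made up for illustration, and the exact locking call used internally may differ from `fcntl.lockf`.

```python
import fcntl
import os
import tempfile


def supports_posix_locks(directory):
    """Return True if an exclusive, non-blocking POSIX lock can be
    taken (and released) on a temporary file created in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            # Try to grab and release an exclusive lock without blocking.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.lockf(fd, fcntl.LOCK_UN)
            return True
        except OSError:
            # The filesystem (e.g. some NAS mounts) refused the lock.
            return False
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling it on the NAS-backed path (for example the target of the `.clearml` symlink) and on a local folder such as `/tmp` should show whether the mount is the odd one out.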
VirtuousFish83 I can confirm clearml-server 1.3 solves the issue.
I think the ClearmlLogger is kind of deprecated ...
Basically all you need is Task.init at the beginning; the default TensorBoard logger will be caught by clearml
Understood, trains does not have auto versioning
What do you mean by auto versioning?
task name is not unique, task ID is unique, you can have multiple tasks with the same name and you can edit the name post execution
I think this issue was fixed in clearml-server 1.3.0 (released after the weekend).
Let me check
I'm just trying to see what is the default server that is set, and is it responsive
I'm assuming you mean your own server, not the demo server, is that correct ?
and then second part is to check if it is up and alive
Yes, you can curl to the ping endpoint:
https://clear.ml/docs/latest/docs/references/api/debug#post-debugping
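If curl is not handy, the same check can be sketched in Python. Assumptions here: the API server address (something like `http://localhost:8008` on a default self-hosted setup) is yours to fill in, the helper names are made up, and the endpoint is taken to accept an empty JSON POST without credentials.

```python
import json
import urllib.request


def build_ping_url(api_server):
    """Build the debug.ping URL for a given API server base address."""
    return api_server.rstrip("/") + "/debug.ping"


def ping(api_server, timeout=5.0):
    """POST an empty JSON body to debug.ping and return the parsed reply.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    request = urllib.request.Request(
        build_ping_url(api_server),
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A connection error from `ping(...)` means the server is not reachable from that machine; any JSON reply at all means it is up.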
Hi PompousParrot44
What do you have in the Execution/"script path" ?
https://stackoverflow.com/questions/5419/python-unicode-and-the-windows-console
Hmm try to set this one before spinning the agent
Windows: set PYTHONIOENCODING=:replace
Inside Colab: os.environ["PYTHONIOENCODING"] = ":replace"
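To see what the variable actually changes, here is a small sketch that starts a child Python process with it set and asks the child which error handler its stdout uses. The helper name `child_stdout_error_handler` is just for illustration.

```python
import os
import subprocess
import sys


def child_stdout_error_handler(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and return
    the error handler that the child's sys.stdout reports."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.errors)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With `":replace"` the encoding part is left empty, so the console encoding stays as it was and only the error handler changes: characters that cannot be encoded are replaced instead of raising UnicodeEncodeError. The variable is read at interpreter start-up, which is why it has to be set before spinning the agent.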
Great!
BTW: you can take some inspiration from here:
https://github.com/allegroai/trains/blob/master/examples/automation/task_piping_example.py
Or from the full pipeline:
https://github.com/allegroai/trains/blob/master/examples/pipeline/pipeline_controller.py
Let's say I don't have the data on my local machine but only in an S3 bucket.
You can still register it, but make sure you do not delete it from the S3 bucket because it will keep a link to it
Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /
What did you put in output_uri?
git config --system credential.helper 'store --file /root/.git-credentials'
Maybe we should use this hack for cloning with user/token in general ...
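For reference, the file that the `store` helper reads holds one URL per line with the user and token embedded. A minimal sketch of producing such a line follows; the host, user and token are placeholders and the helper name is made up.

```python
from urllib.parse import quote


def credential_store_line(host, user, token, scheme="https"):
    """Format one line for a git credential-store file.

    User and token are percent-encoded so that characters such as
    '@' or '/' do not break the URL structure.
    """
    return f"{scheme}://{quote(user, safe='')}:{quote(token, safe='')}@{host}"


# Placeholder values, for illustration only.
print(credential_store_line("example.com", "my-user", "my-token"))
```

Appending that line to `/root/.git-credentials` is what lets the helper configured above answer git's credential prompts.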
Hi VirtuousHedgehong97
I think you need to upgrade your self-hosted clearml-server, could that be the case?
works seamlessly throughout and in our current on premise servers...
I'm assuming via something close to what I suggested above with .netrc ?
is it planned to add a multicursor in the future?
CheerfulGorilla72 can you expand? what do you mean by multicursor ?
Follow-up; any ideas how to avoid PEP 517 with the auto scaler?
Takes a long time to build the wheels
enable venv caching ?
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/docs/clearml.conf#L116
I just set the git credentials in the clearml.conf and it works out of the box
git has issues with passing the user/token from the main repo to the submodules, hence my surprise that it is working out-of-the-box.
Do notice that if you are using an ssh-key this is a non-issue.
Nope, no .netrc defined anywhere, ...
If this is the case can you try to add the following to your "extra_vm_bash_script"
` echo machine example.com > ~/.netrc && echo log...
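To check what a `.netrc` ends up containing, the standard-library parser can read it back. This is a sketch with placeholder values only: the machine, login and password below are hypothetical and are not meant to complete the command above.

```python
import netrc
import os


def write_netrc(path, machine, login, password):
    """Write a single-entry netrc file readable only by its owner."""
    with open(path, "w") as handle:
        handle.write(f"machine {machine} login {login} password {password}\n")
    os.chmod(path, 0o600)


def read_netrc(path, machine):
    """Return (login, password) for `machine`, or None if there is no entry."""
    auth = netrc.netrc(path).authenticators(machine)
    return None if auth is None else (auth[0], auth[2])
```

Running `read_netrc(os.path.expanduser("~/.netrc"), "example.com")` on the VM after the bash script has run shows whether the entry was written the way git expects.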
/home/npuser/.clearml/venvs-builds/3.7/task_repository/commons-imagery-models-py
Yep I see it now, could you simulate locally (i.e have the other folders in the path as well)?
Could it be you also have a file somewhere that is called sfi or imagery or models or chip_classifier that it accidentally tries to import first from?
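One way to check for that kind of shadowing is to ask Python where each name would be imported from, without importing it. A minimal sketch (the helper name is made up):

```python
import importlib.util


def resolve_module_origins(names):
    """Map each top-level module name to the file it would be loaded
    from, or None if it cannot be found (or has no single origin)."""
    origins = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        origins[name] = spec.origin if spec is not None else None
    return origins


# Names taken from the question above.
for name, origin in resolve_module_origins(
    ["sfi", "imagery", "models", "chip_classifier"]
).items():
    print(name, "->", origin)
```

If one of the names resolves to a path outside the repository, that file is the one being picked up first. Namespace packages (folders without `__init__.py`) show up as None here, so a None does not always mean "missing".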
Here, I know the pattern is incomplete and invalid. A less advanced user might not understand what's up.
Basically like your suggestion that if the request fails while typing instead of the error popup the search bar will turn "dark red", and on the next key stroke will be "cleaned" ?
hmm this might help:
https://pip.pypa.io/en/stable/topics/configuration/#environment-variables
basically you might be able to define: PIP_NO_USE_PEP517=1
Hi RoughTiger69
I like the direction this is taking, let me add some more complexity.
My thinking is that if we have "input datasets", I'd also like to be able to clone the Task and automagically change them (with the need to export the dataset_id as an argument), basically I'm thinking:
train = Dataset.get('aabbcc1', name='train')
valid = Dataset.get('aabbcc2', name='validation')
custom = Dataset.get('aabbcc3', name='custom')
Then you end up with HyperParameter Section: "Input Datas...
PYTHONPATH is still not working as expected
inside your code if you do:
import os
print("PYTHONPATH", os.environ["PYTHONPATH"])
what are you getting?
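A slightly more forgiving variant of the same check, in case the variable is not set at all (plain indexing would raise KeyError). It also shows whether each entry made it into `sys.path`. The helper name and the exact-string comparison are simplifications for illustration.

```python
import os
import sys


def pythonpath_report(environ=None, search_path=None):
    """Return (entry, is_in_sys_path) for every PYTHONPATH entry.

    Entries are compared to sys.path as plain strings, so a path that
    was normalized by the interpreter may show up as False.
    """
    environ = os.environ if environ is None else environ
    search_path = sys.path if search_path is None else search_path
    raw = environ.get("PYTHONPATH", "")
    entries = [entry for entry in raw.split(os.pathsep) if entry]
    return [(entry, entry in search_path) for entry in entries]


print("PYTHONPATH", pythonpath_report())
```

An empty list means the variable never reached the process, which points at how the agent (or the container) passes the environment rather than at the code itself.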
BoredHedgehog47
is this ( https://clearml.slack.com/archives/CTK20V944/p1665426268897429?thread_ts=1665422655.799449&cid=CTK20V944 ) the same issue (or solution) ?
We are here if you need further help :)
Hi ExcitedFish86
Good question, how do you "connect" the 3 nodes? (i.e. what is the framework you are using)
Epochs are still round numbers ...
Multiply by 2?! :)
BeefyCow3 if you are trying to optimize a specific metric (i.e. a scalar on a graph), the template Task should report it with the same title/series combination, which should be easy enough to verify in the UI :)
You can either report with Tensorboard or with the Trains Logger, either way will work.
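A minimal sketch of the Logger route, to make the title/series point concrete. The `logger` argument is anything with a `report_scalar(title, series, value, iteration)` method, such as the logger returned by the current task. The title "validation" and series "accuracy" are placeholder names, and the helper itself is made up for illustration.

```python
def report_objective(logger, value, iteration,
                     title="validation", series="accuracy"):
    """Report one scalar point under a fixed title/series pair and
    return that pair, so the optimizer can be pointed at the same one."""
    logger.report_scalar(
        title=title, series=series, value=value, iteration=iteration
    )
    return title, series
```

Whatever pair the template Task reports with is the pair the optimizer has to be configured with. If they differ even by letter case, the optimizer will not find the metric.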