Oh, this is so that internally the background thread can signal it is not deferred. Are you saying there is a bug, or just that the code looks odd?
Did you run clearml-init after the pip install?
Hi @<1562973095227035648:profile|ThoughtfulOctopus83>
The host should be just the host name, without the https:// prefix; I'm assuming that's the issue
DeliciousBluewhale87
You could also just upload the data (i.e. do not call close). Then you will be able to change it later; obviously, this means the dataset version is never locked (finalized), so it is not really traceable.
BTW: the clearml-data stores delta changes, so if you only change a few files it will only store those.
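For reference, a minimal sketch of that flow with the clearml Dataset API (the dataset/project names and folder path are placeholders; finalize() is the call that locks the version):
from clearml import Dataset

# create a new dataset version (names here are placeholders)
ds = Dataset.create(dataset_name="my_dataset", dataset_project="data")

# register the files and push them to storage
ds.add_files(path="./data_folder")
ds.upload()

# skipping ds.finalize() keeps the version open, so it can still be changed later;
# clearml-data stores deltas, so later versions only store the files that changed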
EnviousPanda91
in your clearml.conf I think you are missing a section:
agent.git_user=""
agent.git_pass=""
agent.git_host=""
agent.force_git_ssh_protocol: true
Hi MortifiedDove27
Looks like there is a limit of 100 images per experiment,
The limit is 100 images per unique title/series combination.
This means that changing the title or the series name will add 100 more images (notice the 100 limit is for previous iterations)
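To illustrate, a hedged sketch with the Logger API (project/task names and file paths are placeholders): each distinct title/series pair gets its own 100-image history.
from clearml import Task

task = Task.init(project_name="examples", task_name="debug images")
logger = task.get_logger()

for i in range(300):
    # same title/series for every call, so only the last ~100 iterations are kept
    logger.report_image(title="samples", series="raw", iteration=i, local_path="sample.png")

# a different title/series pair gets its own, separate 100-image history
logger.report_image(title="samples", series="augmented", iteration=0, local_path="aug.png")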
I think it should look something like:
files {
  gsc {
    contents: """{"type": "service_account", "project_id": "ai-platform", "private_key_id": "9999", "private_key": "-----BEGIN PRIVATE KEY-----==\n-----END PRIVATE KEY-----\n", "client_email": "a@ai.iam.gserviceaccount.com", "client_id": "111", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": "..."}"""
    path: "~/gs.cred"
  }
}
Perhaps it is the imports at the start of the script only being assigned to the first task that is created?
Correct!
However, when I split the experiment task out completely, it seems to have built the cloned task correctly.
Nice!!
Okay, this is odd, the request returned exactly 100 out of 100.
It seems not all of them were reported?!
Could you post the toy code, I'll check what's going on.
Hi TenseOstrich47
Does the .ssh folder on the user running the agent contain the correct credentials ?
Basically, as the user running the agent, on the agent's machine, can you clone the repo with:
git clone ssh://git@github.com/15gifts/py-db.git
Yes, I do have my files in the git repo. Although I have not quite understood which part it takes from the remote git repo, and which part it takes from my local system.
it will do "git pull" on the remote machine and then apply any uncommitted changes it has stored in the Task
It seems that one also needs to explicitly hand in the git repo in the pipeline and task definitions via PipelineController,
Correct, unless the pipeline logic and the steps are in the same git repo, you can...
Hi @<1724960468822396928:profile|CumbersomeSealion22>
As soon as I refactor my project into multiple folders, where on top-level I put my pipeline file, and keep my tasks in a subfolder, the clearml agent seems to have problems:
Notice that you need to specify the git repo for each component. If you have a process (step) with more than a single file, you have to have those files inside a git repository, otherwise the agent will not be able to bring them to the remote machine
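A minimal sketch of passing the repo explicitly per step, assuming a recent clearml version where add_function_step accepts repo/repo_branch (the repo URL, names, and paths below are placeholders):
from clearml import PipelineController

def preprocess(raw_path):
    # placeholder step logic
    return raw_path

pipe = PipelineController(name="my pipeline", project="examples", version="1.0")

# each function step can point at the git repo that contains its code
pipe.add_function_step(
    name="preprocess",
    function=preprocess,
    function_kwargs={"raw_path": "/data/raw"},
    repo="https://github.com/org/my-repo.git",  # placeholder URL
    repo_branch="main",
)

pipe.start_locally(run_pipeline_steps_locally=False)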
It's in my local conda environment though.
Meaning this is a wheel installed manually in conda? or is it a folder inside the conda environment ?
I have a client that runs clearml-session and I saw from the agent's logs that the installation of vscode fails.
That makes sense, it downloads vscode at runtime. Do you have an alternative location? Or maybe it is easier to build a container with vscode pre-installed?
Hi ShallowArcticwolf27
However, the AMI for version 0.16.1 has the following docker-compose file
I think we moved the docker-compose yaml when we upgraded from trains to clearml. Any reason you are installing the old docker-compose?
Hi PanickyMoth78
So do not tell anyone, but the next version will have reports built into clearml, as well as the ability to embed graphs in 3rd-party tools (think Notion, GitHub, markdown, etc.)
Until then (ETA mid-Dec), the easiest is to download an image or just use the url (it encodes the full view, so when someone clicks on it they get the exact view you are seeing)
Hi CrookedWalrus33
the python version is auto-detected and registered at "manual execution" time (i.e. when you run your code on your machine).
That said, this is only a suggestion for the agent: if it can actually find the matching Python version it will use it, otherwise it will use whatever is available (i.e. it looks through the PATH environment variable for a matching pythonX.Y executable).
The easiest way to support this would be to just make sure the python binary's path is added to the PATH env.
Does...
We suddenly have a need to set up our logging after every task.close()
Hmm that gives me a handle on things, any chance it is easily reproducible ?
Hi SmugSnake6
I think it was just fixed, let me check if the latest RC includes the fix
PleasantGiraffe85 you can disable the SSL verification on the client end:
https://github.com/allegroai/clearml-agent/blob/21c4857795e6392a848b296ceb5480aca5f98e4b/docs/clearml.conf#L12
Basically you can just manually create the clearml.conf with only the following:
api {
    api_server:
    web_server:
    files_server:
    credentials {"access_key": "EGRTCO8JMSIGI6S39GTP43NFWXDQOW", "secret_key": "x!XTov_G-#vspE*Y(h$Anm&DIc5Ou-F)jsl$PdOyj5wG1&E!Z8"}
    # verify...
}
GreasyPenguin14 what's the clearml version you are using, OS & Python?
Notice this happens on the "connect_configuration" that seems to be called after the Task was closed, could that be the case ?
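For example, a minimal repro sketch of the pattern being asked about (project/task names are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="config repro")
task.close()

# if this line only runs after close(), it would explain the behavior above
task.connect_configuration({"param": 1}, name="my config")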
Hi TrickyRaccoon92
If you are reporting to tensor-board, then "iteration" equals step. Is this the case?
I don't have the compose file, or at least can't seem to find it in /opt
you can manually take down all dockers with:
docker ps
then docker stop <container id> for each container id
Hi MinuteWalrus85
This is a great question, and super important when training models. This is why we designed a whole system to manage datasets (including storage querying, balancing data, and caching). Unfortunately this is only available in the paid tier of Allegro... You are welcome to contact the sales guys: https://allegro.ai/enterprise/
RoundMosquito25 good news, no need to open any ports.
Basically the agents are always polling the server for "jobs", i.e. they create http/s requests from them to the server, so all connections are outgoing connections. The firewall stays intact.
function and just seem to be getting an "isadirectory" error?
Can you post here what you are getting ? which clearml version are you using ?!
also tried manually adding leap==0.4.1 in the task UI which didn't work.
That has to work, if it did not, can you send the log for the failed Task (or the Task that did not install it)?
The environment in the logs does show that leap is being installed potentially from a cache?
- leap @ file:///opt/keras-hannd...
Hi LonelyMoth90 , where exactly are you getting the error ? Is it trains-agent running your experiment ?
Manually I was installing the leap package through python -m pip install . when building the docker container.
NaughtyFish36 what happens if you add /opt/keras-hannd to your "installed packages"? This should translate to "pip install /opt/keras-hannd", which seems like exactly what you want, no?
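If you prefer to force it from code rather than the UI, a minimal sketch (assuming the /opt/keras-hannd path from above; note Task.add_requirements must be called before Task.init):
from clearml import Task

# add the local package path to the Task's "installed packages"
# (the agent should then run the equivalent of "pip install /opt/keras-hannd")
Task.add_requirements("/opt/keras-hannd")

task = Task.init(project_name="examples", task_name="leap test")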