AgitatedDove14 AFAIK, ClearML sends the git repo, branch and commit ID if a git repo is present in the working directory, without my needing to specify it. When it sends that information, clearml agent tries to pull the repo at the specified branch and commit ID, and the run proceeds from there. This is what I meant by "git integration". If no git repo is present in the working directory, clearml agent skips the "pulling the repo" step, since none is specified, and uploads only my script instead.
Oddly, clearml agent is unable to run the script if a git repo is present and marks the experiment as complete instantly. If a git repo is not present, it runs the script perfectly as intended.
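To make the "git integration" described above concrete, here is a minimal sketch of how the repo, branch and commit ID of a working directory can be read with plain git commands. It is an illustration only, not ClearML's actual detection code; the helper names `git_output` and `describe_repo` are hypothetical.

```python
import subprocess
from pathlib import Path


def git_output(args, cwd):
    """Run a git command in cwd; return stripped stdout, or None if it fails."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Non-zero exit (e.g. not a repo, no commits yet) or git not installed.
        return None
    return result.stdout.strip()


def describe_repo(working_dir="."):
    """Return repo URL, branch and commit for working_dir, or None outside a repo."""
    cwd = str(Path(working_dir).resolve())
    if git_output(["rev-parse", "--is-inside-work-tree"], cwd) != "true":
        return None
    return {
        "repo": git_output(["config", "--get", "remote.origin.url"], cwd),
        "branch": git_output(["rev-parse", "--abbrev-ref", "HEAD"], cwd),
        "commit": git_output(["rev-parse", "HEAD"], cwd),
    }
```

Calling `describe_repo()` in the directory the job is launched from shows whether a repo would be picked up at all: `None` corresponds to the "script only" case, a dictionary to the "git" case.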
AgitatedDove14 I have a training job, very similar to this one: https://clearml.slack.com/archives/CTK20V944/p1654606983176539?thread_ts=1654604976.568279&cid=CTK20V944
As soon as I launch the job with git, the task marks itself as completed without launching the actual job, even if I mount the volume as I do without git.
ZanyPig66 what do you mean by "git integration"? What would be the two ways of calling the function, where one works and the other does not?
Can you also try specifying the branch/commit?
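A sketch of what specifying the branch/commit could look like, assuming the `repo`, `branch`, `commit` and `script` arguments of `Task.create`. The helper names `pinned_task_kwargs` and `create_pinned_task` are hypothetical, and the repository URL and branch in the usage comment are placeholders, not values from this thread.

```python
def pinned_task_kwargs(repo, branch=None, commit=None, script="train.py"):
    """Build keyword arguments for Task.create that pin the code version."""
    kwargs = {"repo": repo, "script": script}
    if branch:
        kwargs["branch"] = branch
    if commit:
        kwargs["commit"] = commit
    return kwargs


def create_pinned_task(**extra):
    """Create the task from this thread with extra arguments passed through."""
    # Imported lazily so the helper above can be used without clearml installed.
    from clearml import Task

    return Task.create(
        project_name="deneme",
        task_name="git deneme",
        **extra,
    )


# Usage (placeholder URL and branch):
# task = create_pinned_task(
#     **pinned_task_kwargs("https://example.com/org/repo.git", branch="main")
# )
```

Pinning the repo explicitly removes the dependence on whatever git state happens to be detected in the launch directory, which makes the two cases easier to compare.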
AgitatedDove14 Certainly! This completely aligns with my observations. However, this one should be a feature to work on, and should be fairly easy to implement.
Okay, after lots of trial and error, I found that the execution script should be in git too. ClearML sends the changes automatically, but files that do not exist within the repo are apparently not sent. This is somewhat counter-intuitive.
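A quick way to check for this situation before enqueuing is to ask git whether the execution script is tracked. This is a minimal sketch using `git ls-files --error-unmatch`; the helper name `is_tracked` is hypothetical and the check is not part of ClearML.

```python
import subprocess


def is_tracked(path, repo_dir="."):
    """Return True if path is tracked by the git repo at repo_dir."""
    try:
        result = subprocess.run(
            ["git", "ls-files", "--error-unmatch", "--", path],
            cwd=repo_dir,
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # git is not installed or repo_dir does not exist.
        return False
    # Exit code 0 means the file is in the index; anything else means
    # it is untracked or repo_dir is not inside a git repository.
    return result.returncode == 0
```

If `is_tracked("train.py")` is false while the directory is a git repo, the script is exactly the kind of file that does not reach the agent.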
AgitatedDove14 To elaborate, the code below does not work with git integration activated.
`from clearml import Task

task = Task.create(
    project_name="deneme",
    task_name="git deneme",
    packages=["protobuf==3.20.0"],
    docker="databossds/easyvision",
    docker_args="-v /home/user/awesome_dir:/workspace",
    add_task_init_call=True,
    script="train.py",
)
Task.enqueue(task, "default")`
However, the very same code does work WITHOUT git integration activated.
ZanyPig66 this should have worked, any chance you can send the full execution log (in the UI "results -> console" download full log) and attach it here? (you can also DM it so it is not public)
ZanyPig66 it sounds like you need to add the docker args for binding, just add to the Task.create the argument: 'docker_args="-v /mnt/host:/mnt/container"'
However, when I try to bind a volume and run the code, everything runs perfectly.
Can you please elaborate on what this means?
However, this one should be a feature to work on, and should be fairly easy to implement.
Feel free to add as GitHub issue 🙂
Main challenge is understanding what needs to be added as "uncommitted changes"
The driver script (the one that initializes the models and starts a training sequence) was not in the git repo; everything else is.
Yes, there is an issue when you have both a git repo and a totally uncommitted file: clearml can store either a standalone script or a git repository, and a mix of the two is not actually supported. Does that make sense?
AgitatedDove14 Sorry for the very late response. The driver script (the one that initializes the models and starts a training sequence) was not in the git repo; everything else is.
AgitatedDove14 CostlyOstrich36 That was exactly what I was doing (docker_args="-v /mnt/host:/mnt/container").
Ohh yes, if the execution script is not on git and git exists, it will not add it (it will add it via the uncommitted changes section if it is a tracked file)
ZanyPig66 in order to expand the support to your case, can you explain exactly which files are on git and which are not?
AgitatedDove14 And I'm sending the job via the specified code at the beginning of this thread.
ZanyPig66 you are correct in your assumptions. What exactly do you have in the Task? If there is no git repo, the entire script should be under "uncommitted changes". What is your case?