@<1523701070390366208:profile|CostlyOstrich36> You misunderstood me. I want to send a single function from my code to the agent, wait for its calculations to finish, and then continue with the rest of the code. I don't need to send the whole script.
Thanks, I've seen this option. I thought it would be possible to do it via Task. I tried the Pipeline approach: the pod starts and the library is installed, but in the end there is no file. What could be the problem? Here are the logs:
Environment setup completed successfully
Starting Task Execution:
/root/.clearml/venvs-builds/3.10/bin/python: can't open file '/root/.clearml/venvs-builds/3.10/task_repository/clearml-agent.git/test_remote_execution.py': [Errno 2] No such fi...
I checked, and there is no script at that path. Where should it be?
The pod can easily download the dataset and upload logs to the fileserver, but it can't upload the model 😀
I'm currently unsure about the correct approach. Would you kindly review my attempts and point out where I might have made a mistake? Here's what I've tried:
- I've added the default URL in the agent Helm chart:
clearml:
  ...
  clearmlConfig: |-
    sdk {
      development {
        default_output_uri: "
        "
      }
    }
- I've added the URL in the agent section:
agentk8sglue:
  ...
  fileServerUrlReference:
- In the Python fil...
But that's odd. What if I want to run code without a repository, for example through "execute_remotely" or "add_function_step"? By default a repository is assumed not to be needed, isn't it?
@<1523701070390366208:profile|CostlyOstrich36> I may not fully understand the functionality of remote code execution. Do I always need to have a git repository for this?
I ran the code from inside the pod created by the agent, and the model was uploaded. But when the task is started by the agent's command, it doesn't upload. Magic.
I didn't save it explicitly in any way. I relied on ClearML's auto-save.
@<1523701070390366208:profile|CostlyOstrich36> Thanks for the response. Can I ask a second question? I have a script main.py in my Docker image at the path "/", but when ClearML starts my container on the agent, it tries to execute it from "/root/.clearml/venvs-builds/3.10/code/". Do you know how to change this behavior? For example, I tried the --cwd argument, but clearml-task tells me "repository(Error: working directory '{}', must be relative to repository root)", and I don't use a repository...
As for the agent: it is listening on the queue, but the problem is that I can't put a task in the queue without --script or a module. Here is the relevant code from clearml-task:
if raise_on_missing_entries and not base_task_id:
    if not script and not module:
        raise ValueError("Entry point script not provided")
    if not repo and not folder and (script and not Path(script).is_file()):
        raise ValueError("Script file '{}' could not be found".format(script))
But wh...
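To see which argument combinations the quoted check accepts, it can be copied into a standalone function and called directly. This is only a sketch: `validate_entry_point` and `describe` are hypothetical names, the parameters are assumed to mirror the variables in the quoted snippet, and the rest of clearml-task is not reproduced.

```python
from pathlib import Path
from typing import Optional


def validate_entry_point(
    script: Optional[str] = None,
    module: Optional[str] = None,
    repo: Optional[str] = None,
    folder: Optional[str] = None,
    base_task_id: Optional[str] = None,
    raise_on_missing_entries: bool = True,
) -> None:
    """Standalone copy of the quoted check; raises ValueError on a rejected combination."""
    if raise_on_missing_entries and not base_task_id:
        # Either a script or a module must be given as the entry point.
        if not script and not module:
            raise ValueError("Entry point script not provided")
        # Without a repo or folder, the script has to exist on the local disk.
        if not repo and not folder and (script and not Path(script).is_file()):
            raise ValueError("Script file '{}' could not be found".format(script))


def describe(**kwargs) -> str:
    """Return a short verdict instead of raising, for quick experiments."""
    try:
        validate_entry_point(**kwargs)
        return "accepted"
    except ValueError as err:
        return "rejected: {}".format(err)


if __name__ == "__main__":
    print(describe())
    print(describe(script="/no/such/file.py"))
    print(describe(script="/no/such/file.py", repo="https://example.com/repo.git"))
    print(describe(base_task_id="abc123"))
```

Read this way, the check is skipped entirely when a base task ID is supplied, and the local-file test is skipped when a repo or folder is given, so a script that exists only inside the image fails the check unless one of those is set.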
As I understand it, that's CLEARML_AGENT_FORCE_CODE_DIR? I'm trying to understand from the documentation whether I should specify these variables in the agent Dockerfile or whether I can specify them dynamically.
Yep, it's inside the repo. The steps are the same as in the documentation:
clearml-task --project examples --name remote_test --script /path/to/my/script.py \
  --packages "keras" "tensorflow>2.2" --args epochs=1 batch_size=64 \
  --queue dual_gpu
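When the same command is launched from Python rather than typed into a shell, building it as an argument list avoids quoting problems with values such as "tensorflow>2.2". A minimal sketch, using only the flags that appear in the command above; `build_clearml_task_argv` is a hypothetical helper, not part of ClearML:

```python
import shlex
from typing import Dict, List, Optional, Sequence


def build_clearml_task_argv(
    project: str,
    name: str,
    script: str,
    queue: str,
    packages: Sequence[str] = (),
    args: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Assemble the clearml-task command line as a list of arguments."""
    argv = ["clearml-task", "--project", project, "--name", name, "--script", script]
    if packages:
        argv += ["--packages", *packages]
    if args:
        # Each script argument is passed as key=value, as in the command above.
        argv += ["--args", *("{}={}".format(key, value) for key, value in args.items())]
    argv += ["--queue", queue]
    return argv


if __name__ == "__main__":
    argv = build_clearml_task_argv(
        project="examples",
        name="remote_test",
        script="/path/to/my/script.py",
        queue="dual_gpu",
        packages=["keras", "tensorflow>2.2"],
        args={"epochs": 1, "batch_size": 64},
    )
    # Print a shell-safe rendering; pass argv itself to subprocess.run to execute it.
    print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would run it without a shell, so the `>` in the package specifier is never interpreted as a redirect.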
I have found the solution: I should store the script outside the git repo. But I don't think forced autodetection in ClearML is a good choice; there should be a flag like --no_autodetect_git.
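To check up front whether a script would count as being "inside a repo", one can walk up its parent directories looking for a `.git` entry. This is only an illustration of the idea, not ClearML's actual detection logic; `find_git_root` is a hypothetical helper:

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_git_root(script_path: str) -> Optional[Path]:
    """Return the closest ancestor directory that contains a .git entry, or None."""
    start = Path(script_path).resolve().parent
    for directory in (start, *start.parents):
        if (directory / ".git").exists():
            return directory
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        # Fake repository layout: <tmp>/repo/.git and <tmp>/repo/src/train.py
        (root / "repo" / ".git").mkdir(parents=True)
        (root / "repo" / "src").mkdir()
        script = root / "repo" / "src" / "train.py"
        script.write_text("print('hello')\n")
        print("repository root:", find_git_root(str(script)))
```

If the function returns `None` for a script, no enclosing work tree was found, which corresponds to the "store the script outside the git repo" workaround.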
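OK, guys, I did it by uploading the model manually:
import os
from tempfile import gettempdir

import torch
from clearml import OutputModel, Task

task = Task.init(project_name='test', task_name='PyTorch MNIST train filserver dataset')
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri=" None ")
# `model` is the trained network from the training script.
# Save its weights to a temporary file, then register that file with the output model.
tmp_dir = os.path.join(gettempdir(), "mnist_cnn.pt")
torch.save(model.state_dict(), tmp_dir)
output_model.update_weights(weights_filename=tmp_dir)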
Hi @<1523701070390366208:profile|CostlyOstrich36>, I tried this, but it doesn't work. Should it be the fileserver URL?
OK, I found out that with scikit-learn the model is uploaded, but with PyTorch it isn't.
OK, maybe someone knows: how does a pod created by the K8s agent know the model registry URL? When I added the output_uri parameter to the Task, like output_uri=" None ", nothing is shown anymore. Previously, without this parameter, it showed a path like " None ...." in WebUI->Experiments->Artifacts.