I didn't save it in any way. I relied on the auto-save from ClearML.
@<1523701070390366208:profile|CostlyOstrich36> You misunderstood me. I want to send a single function from my code to the agent, wait for its computation to finish, and then continue with the rest of the code. I don't need to send the whole script.
Yep, it's inside the repo. The steps are the same as in the documentation:
clearml-task --project examples --name remote_test --script /path/to/my/script.py \
  --packages "keras" "tensorflow>2.2" --args epochs=1 batch_size=64 \
  --queue dual_gpu
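When the same command is launched from Python rather than typed into a shell, building the argument list explicitly avoids quoting problems with values such as tensorflow>2.2. The following is a minimal sketch that uses only the flags shown in the command above; the helper name build_clearml_task_cmd is hypothetical and is not part of ClearML:

```python
import shlex


def build_clearml_task_cmd(project, name, script, packages, args, queue):
    """Return the clearml-task invocation as an argv list (hypothetical helper)."""
    cmd = ["clearml-task", "--project", project, "--name", name, "--script", script]
    if packages:
        # Each package requirement is passed as its own argument
        cmd += ["--packages", *packages]
    if args:
        # Script arguments are passed as key=value pairs
        cmd += ["--args", *[f"{key}={value}" for key, value in args.items()]]
    cmd += ["--queue", queue]
    return cmd


cmd = build_clearml_task_cmd(
    project="examples",
    name="remote_test",
    script="/path/to/my/script.py",
    packages=["keras", "tensorflow>2.2"],
    args={"epochs": 1, "batch_size": 64},
    queue="dual_gpu",
)
# shlex.join quotes arguments the shell would otherwise interpret, such as ">"
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) without going through a shell, so the ">" in the version specifier is never treated as a redirect.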
I have found the solution: store the script outside the git repo. Still, I don't think unconditional autodetection of the repository is a good choice in ClearML; there should be a flag like --no_autodetect_git.
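To check beforehand whether a script sits inside a git working tree, one option is to walk up its parent directories and look for a .git entry. This is only a sketch of what "inside a repo" is assumed to mean here; it is not ClearML's actual detection logic, and find_git_root is a hypothetical helper:

```python
from pathlib import Path
from typing import Optional


def find_git_root(script_path: str) -> Optional[Path]:
    """Return the nearest parent directory containing a .git entry, or None."""
    start = Path(script_path).resolve().parent
    for directory in (start, *start.parents):
        # .git is a directory in a normal clone and a file in a worktree
        if (directory / ".git").exists():
            return directory
    return None
```

If find_git_root returns None for the script's location, the script is outside any git working tree by this definition, which matches the workaround described above.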
Ok, maybe someone knows: how does a pod created by a K8s agent know the model registry URL? When I added the output_uri parameter in the Task, like output_uri=" None ", it doesn't show anything now. Previously, without this parameter, it showed a path like " None ...." in WebUI->Experiments->Artifacts
Ok, guys, I did it by uploading the model manually:
import os
from tempfile import gettempdir

import torch
from clearml import OutputModel, Task

task = Task.init(project_name='test', task_name='PyTorch MNIST train filserver dataset')
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri=" None ")
# Save the weights to a temporary file, then upload that file
tmp_dir = os.path.join(gettempdir(), "mnist_cnn.pt")
torch.save(model.state_dict(), tmp_dir)
output_model.update_weights(weights_filename=tmp_dir)
I ran this code from a pod created by the agent, and the model was uploaded. But when the task is started by the agent command, the model isn't uploaded. Magic.
Hi @<1523701070390366208:profile|CostlyOstrich36> , I tried this, but it doesn't work. Should it be the fileserver URL?
Thanks, I've seen this option. I thought it would be possible to do it via Task. I tried the Pipeline approach: the pod starts and the library is installed, but in the end the file is not found. What could be the problem? Here are the logs:
Environment setup completed successfully
Starting Task Execution:
/root/.clearml/venvs-builds/3.10/bin/python: can't open file '/root/.clearml/venvs-builds/3.10/task_repository/clearml-agent.git/test_remote_execution.py': [Errno 2] No such fi...
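The path in the error suggests where the agent expected the script: directly inside a repository directory under task_repository. The sketch below splits such a log line into that repository directory and the script path relative to it. The regular expression and the helper name parse_missing_script are assumptions based on this single log line, not a documented log format:

```python
import re
from pathlib import PurePosixPath

# Pattern derived from the one log line above (an assumption, not a documented format)
MISSING_FILE = re.compile(r"can't open file '(?P<path>[^']+)': \[Errno 2\]")


def parse_missing_script(log_line):
    """Return (repository_dir, script_relative_path), or None if the line does not match."""
    match = MISSING_FILE.search(log_line)
    if match is None:
        return None
    parts = PurePosixPath(match.group("path")).parts
    if "task_repository" not in parts:
        return None
    idx = parts.index("task_repository")
    # The directory right after task_repository is taken to be the repository checkout
    repo_dir = PurePosixPath(*parts[: idx + 2])
    script = PurePosixPath(*parts[idx + 2:])
    return str(repo_dir), str(script)


line = (
    "/root/.clearml/venvs-builds/3.10/bin/python: can't open file "
    "'/root/.clearml/venvs-builds/3.10/task_repository/clearml-agent.git/"
    "test_remote_execution.py': [Errno 2] No such fi..."
)
print(parse_missing_script(line))
```

For the log above, the script is looked up at the top level of the clearml-agent.git directory, so it would have to exist at that location in the repository the task points to.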
But that's odd. If I run the code through "execute_remotely" or "add_function_step", I would expect that no repository is needed by default. Isn't that so?
@<1523701070390366208:profile|CostlyOstrich36> I may not fully understand the functionality of remote code execution. Do I always need to have a git repository for this?
I checked, and there is no script at that path. Where should it be?
The pod can easily download the dataset and upload logs to the fileserver, but it can't upload the model 😀
Ok, I found out that with scikit-learn the model is uploaded, but with PyTorch it isn't.
I'm currently unsure about the correct approach. Would you kindly review my attempts and point out where I might have made a mistake? Here's what I've tried:
- I've added the default URL in the agent Helm chart:
clearml:
  ...
  clearmlConfig: |-
    sdk {
      development {
        default_output_uri: " "
      }
    }
- I've added the URL in the agent section:
agentk8sglue:
  ...
  fileServerUrlReference:
- In the Python fil...