While creating GCP credentials using None
What values should I insert in the following step so that the autoscaler has access? As of now, I have left this field blank.
thanks for the help though!!
So I am running a pipeline on a GCP VM with 1 NVIDIA GPU, and my requirements.txt has:
torch==1.13.1+cu117
torchvision==0.14.1+cu117
When I run the YOLO training step, I get the above error.
When the package installation is done in the task, I am not able to see cu117 there.
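One way to confirm which build actually landed in the task environment is to read the local version label of the installed package. This is a minimal sketch, assuming the distribution is named torch; the helper names cuda_tag and installed_cuda_tag are hypothetical and are not part of ClearML or PyTorch. The +cu117 wheels are hosted on the PyTorch wheel index (download.pytorch.org/whl/cu117), so whether the tag appears may depend on which index pip resolved the package from.

```python
from importlib import metadata
from typing import Optional


def cuda_tag(version: str) -> Optional[str]:
    """Return the local version label (e.g. 'cu117') of a version string, if any."""
    # PEP 440 local version labels follow a '+' sign: '1.13.1+cu117' -> 'cu117'.
    _, sep, local = version.partition("+")
    return local if sep and local else None


def installed_cuda_tag(package: str = "torch") -> Optional[str]:
    """Look up the installed package and return its local version label, if any."""
    try:
        return cuda_tag(metadata.version(package))
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return None


print(cuda_tag("1.13.1+cu117"))  # cu117
print(cuda_tag("0.14.1"))        # None
```

Calling installed_cuda_tag() at the top of the training step would log the tag that the agent's virtual environment ended up with.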
2023-10-03 20:46:07,100 - clearml.Auto-Scaler - INFO - Spinning new instance resource='clearml-autoscaler-vm', prefix='dynamic_gcp', queue='default'
2023-10-03 20:46:07,107 - googleapiclient.discovery_cache - INFO - file_cache is only supported with oauth2client<4.0.0
2023-10-03 20:46:07,122 - clearml.Auto-Scaler - INFO - Creating regular instance for resource clearml-autoscaler-vm
2023-10-03 20:46:07,264 - clearml.Auto-Scaler - INFO - --- Cloud instances (0):
2023-10-03 20:46:07,482 - clearm...
I provided the credentials while setting up the autoscaler instance. Where can I look for the clearml.conf? When I SSH into the instance spun up by the autoscaler, I am not able to see the clearml.conf.
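For locating the file, a small script can list the usual candidates. This is a sketch under assumptions: it checks the CLEARML_CONFIG_FILE environment variable and the default clearml.conf in the home directory; the function names are hypothetical. The agent may run as a different user than the one used for SSH (for example root), so the home directory passed in matters, and it is possible that the autoscaler supplies its settings another way, in which case no file will be found.

```python
import os
from pathlib import Path
from typing import List, Mapping


def candidate_config_paths(env: Mapping[str, str], home: Path) -> List[Path]:
    """List places to look for clearml.conf, most specific first."""
    paths = []
    # CLEARML_CONFIG_FILE, when set, points at an explicit config file.
    override = env.get("CLEARML_CONFIG_FILE")
    if override:
        paths.append(Path(override).expanduser())
    # Default location: clearml.conf in the home directory of the user running the agent.
    paths.append(home / "clearml.conf")
    return paths


def find_config(env: Mapping[str, str], home: Path) -> List[Path]:
    """Return only the candidate paths that exist on this machine."""
    return [p for p in candidate_config_paths(env, home) if p.is_file()]


if __name__ == "__main__":
    for path in candidate_config_paths(os.environ, Path.home()):
        print(path, "(found)" if path.is_file() else "(missing)")
```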
Let me know if this is enough information or not
OK, I was able to resolve the above issue, but now I am getting the following error while executing a task:
    import cv2
  File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/cv2/__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _boots...
while we spin up the autoscaler instance
Because I think I need to have the following two lines in the .bashrc, along with GOOGLE_APPLICATION_CREDENTIALS:
git config --global user.email 'email'
git config --global user.name "user_name"
And one more thing: is there a way to make changes to the .bashrc inside the Docker container?
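Instead of editing .bashrc by hand, the same lines could be generated once and pasted into whatever setup hook runs before the task starts. The sketch below only builds the script text; the function names, the example email, name, and key path are all hypothetical. Where the script goes is an assumption: ClearML's Task.set_base_docker accepts a docker_setup_bash_script argument, which looks like one possible place for it.

```python
import shlex
from typing import List


def git_identity_commands(email: str, user_name: str) -> List[str]:
    """Build the two git config commands with safely quoted values."""
    return [
        f"git config --global user.email {shlex.quote(email)}",
        f"git config --global user.name {shlex.quote(user_name)}",
    ]


def setup_script(email: str, user_name: str, credentials_path: str) -> str:
    """Join the setup lines into one bash script body."""
    lines = git_identity_commands(email, user_name)
    # GOOGLE_APPLICATION_CREDENTIALS is the variable Google client libraries read.
    lines.append(
        f"export GOOGLE_APPLICATION_CREDENTIALS={shlex.quote(credentials_path)}"
    )
    return "\n".join(lines)


# Placeholder values for illustration only.
print(setup_script("me@example.com", "My Name", "/root/gcp-key.json"))
```

Quoting with shlex.quote keeps names containing spaces or quotes from breaking the generated commands.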
Note: switching to 'commit_id'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
  git switch -c <new-branch-name>
Or undo this operation with:
  git switch -
Turn off this advice by setting ...
@ManiacalLizard2 @SuccessfulKoala55 If you can let me know how to resolve this, that would be very helpful.
All I need to do is:
pip install -r requirements.txt
pip install .
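If nothing else works, a step could run these two installs itself before importing the custom packages. This is a minimal sketch, assuming the repository has already been cloned to repo_dir on the remote machine; install_commands and install_all are hypothetical helper names, and this is a workaround rather than a ClearML feature.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def install_commands(repo_dir: Path) -> List[List[str]]:
    """Return the two pip invocations, using the current interpreter's pip."""
    pip = [sys.executable, "-m", "pip", "install"]
    return [
        pip + ["-r", str(repo_dir / "requirements.txt")],
        pip + [str(repo_dir)],  # same effect as `pip install .` run from repo_dir
    ]


def install_all(repo_dir: Path) -> None:
    """Run both installs in order, stopping at the first failure."""
    for cmd in install_commands(repo_dir):
        subprocess.run(cmd, check=True)
```

Using sys.executable -m pip ensures the packages go into the same virtual environment the task is running in, not a system-wide pip.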
I am able to run the pipeline locally though
individual steps are failing
And also I have a requirements file which I want to be installed when I run the pipeline remotely
I want to know how to execute pip install . to import all the custom packages
So what I want to do is import the custom packages into my remote execution
I am able to get the requirements installed for each task
So the issue I am facing is this: I am running the pipeline controller task on my local system agent and the steps of the pipeline on an agent running on a GCP VM. The first step of the pipeline is failing with clearml_agent: ERROR: Failed cloning repository.
So you mean that when I SSH into my VM, I need to do a git clone and then spin up the agent, right?
OK, it's cloning, but it's asking for my GitHub credentials.
Do I need to make changes to clearml.conf so that it doesn't ask for my credentials, or is there another way around this?
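Both routes appear to exist: clearml.conf has agent.git_user and agent.git_pass settings, and the agent also reads the CLEARML_AGENT_GIT_USER and CLEARML_AGENT_GIT_PASS environment variables. The sketch below takes the environment-variable route; agent_env is a hypothetical helper, and the user and token values are placeholders (a personal access token is assumed in place of a password).

```python
from typing import Dict, Mapping


def agent_env(base: Mapping[str, str], git_user: str, git_token: str) -> Dict[str, str]:
    """Copy the base environment and add git credentials for clearml-agent."""
    env = dict(base)
    # Environment counterparts of agent.git_user / agent.git_pass in clearml.conf.
    env["CLEARML_AGENT_GIT_USER"] = git_user
    env["CLEARML_AGENT_GIT_PASS"] = git_token
    return env


# Usage sketch (not executed here): start the agent with the extended environment.
# import os, subprocess
# subprocess.run(["clearml-agent", "daemon", "--queue", "default"],
#                env=agent_env(os.environ, "my-user", "my-token"))
```

Passing the credentials through the environment keeps them out of files on the VM, though they will still be visible to anything that can inspect the agent process.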
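import fiftyone as fo

# Assumes task is an existing ClearML Task and labels_path points to the COCO labels JSON.
dataset = fo.Dataset.from_dir(
    labels_path=labels_path,
    dataset_type=fo.types.COCODetectionDataset,
    label_field="ground_truth",
    use_polylines=True,
)
# Upload the loaded dataset object as a task artifact.
task.upload_artifact(
    name="Dataset",
    artifact_object=dataset,
)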