It returns false. Just to share a bit more, I have the requirements.txt in GitLab together with my code, inside folders. Do I need to provide a GitLab path?
Ah, I think I was not very clear on my requirement. I was looking at porting at the project level, not moving the entire ClearML data over. Is that possible instead?
Hello CostlyOstrich36, I am facing an issue now. Basically, I installed all the necessary Python packages in my Docker image, but somehow the clearml-agent does not seem to detect these global packages. I don't see them in the "installed packages". Any advice?
For example, I built my Docker image from an image on Docker Hub. In this image, I installed the torch and cupy packages. But when I run my experiment in this image, the packages are not found.
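For context, this is roughly how I attach the experiment to that image (the image name is a placeholder, and I am assuming set_base_docker is the right call for this):

```python
from clearml import Task

task = Task.init(project_name="my_project", task_name="torch-cupy experiment")
# Placeholder image name; this is the image in which torch and cupy were
# pre-installed. Assuming set_base_docker is how the image gets attached.
task.set_base_docker("mydockerhubuser/torch-cupy:latest")
```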
Yes, I ran the experiment inside.
Do you have an example of how I can define the packages to be installed for every step of the pipeline?
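Something along these lines is what I am after; a rough sketch only, and the packages argument plus the step and package names are just my assumption of how it might look:

```python
from clearml import PipelineDecorator

# Sketch: per-step package lists (names below are placeholders).
@PipelineDecorator.component(return_values=["data"], packages=["pandas", "numpy"])
def prepare_data():
    import pandas as pd
    return pd.DataFrame({"x": [1, 2, 3]})

@PipelineDecorator.component(return_values=["result"], packages=["torch", "cupy"])
def train(data):
    # training code would go here
    return "done"

@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="0.1")
def run_pipeline():
    data = prepare_data()
    return train(data)
```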
SuccessfulKoala55 I tried commenting out the fileserver; the ClearML Docker containers started, but the service does not seem to come up properly. When I access ClearML via the web browser, the site cannot be reached.
Just to confirm, I commented these out in docker-compose.yaml.
OK, let me try adding it to the volume mount.
Hi Bart, yes. Running with inference container.
Not exactly sure yet, but I would think a user tag for "deployed" makes sense, as it should be a deliberate user action. An additional system state is required too, since a deployed state should have some prerequisite system state.
I would also like to ask whether ClearML has different states for a task, a model, or even different task types? Right now I don't see any differences; is this a deliberate design?
I guess we need to understand the purpose of the various states. So far I only see "archive, draft, publish". Did I miss any?
Thanks. The examples use upload_artifact, which stores the files in the output_uri. What if I do not want to save the files but simply pass them to the next step; is there a way to do so?
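For reference, this is roughly the pattern from the examples I am referring to (the artifact name and payload are placeholders; just a sketch):

```python
from clearml import Task

# The step grabs its own task and uploads the intermediate result as an
# artifact, which ends up stored under the configured output_uri.
task = Task.current_task()
intermediate = {"rows": 1000, "path": "/tmp/processed.csv"}  # placeholder payload
task.upload_artifact(name="processed_data", artifact_object=intermediate)
```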
Hi @<1523701070390366208:profile|CostlyOstrich36> , basically:
- I uploaded a dataset using ClearML Datasets. The output_uri points to my S3, so the dataset is stored in S3. My S3 is set up with HTTP only.
- When I retrieve the dataset for training using Dataset.get(), I encountered an SSL cert error, as the URL used to retrieve the data was s3://<s3url>/... which is HTTP. This is weird, as the dataset URL is without HTTPS.
- I am not too sure why and I susp...
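Roughly what I am doing, for reference; the dataset names, bucket URL, and local paths below are placeholders, so treat it as a sketch of the flow rather than my exact code:

```python
from clearml import Dataset

# Upload side: output URL points at my HTTP-only S3 endpoint (placeholder).
ds = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")
ds.add_files(path="local_data/")
ds.upload(output_url="s3://my-s3-endpoint/bucket")
ds.finalize()

# Training side: this is where the SSL cert error shows up.
train_ds = Dataset.get(dataset_name="my_dataset", dataset_project="my_project")
local_path = train_ds.get_local_copy()
```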
May I know which environment variable I should set the cert in?
Try adding the <path to your cert> for s3.credentials.verify.
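Something along these lines in clearml.conf; the endpoint, keys, and cert path are placeholders, so take it as a rough sketch of where the setting sits rather than a confirmed config:

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # placeholders; adjust to your own endpoint and keys
                    host: "my-s3-endpoint:9000"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    secure: false
                    verify: "/path/to/your/cert.pem"
                }
            ]
        }
    }
}
```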
Do you want to share your clearml.conf here?
I'm not very sure, to be honest. Just want to see if this is useful...
I got an SSL error a few days back and I solved it by adding the cert to /etc/ssl/certs and performing
Add this. Note that verify might not work with sdk.aws.s3.credentials. Please see the attached image.
When I run it as a regular remote task, it works. But when I run it as a step in a pipeline, it cannot access the same folder on my local machine.
Just to add, when I run the pipeline locally it works as well.
Yes. The training works well with CUDA.
I have yet to figure out how to do so; I would appreciate it if you could give some guidance.
@<1523701205467926528:profile|AgitatedDove14> when my code gets the ClearML datasets, it stores them in the cache, e.g. $HOME/.clearml/cache....
I wanted it to be in a mounted PV instead, so other pods (on the same node) that need the same datasets can use them without pulling again.
By the way, will downloading still happen if the dataset is already available in the cache folder? Any specific settings to add to Dataset.get_local_copy()?
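For reference, this is roughly how I pull the dataset today; the cache-related bits are my assumption of what could be tweaked, not something I have confirmed:

```python
import os

# Assumption: point the ClearML cache at the mounted PV before clearml loads
# its config (I believe CLEARML_CACHE_DIR, or sdk.storage.cache.default_base_dir
# in clearml.conf, controls this; not confirmed).
os.environ.setdefault("CLEARML_CACHE_DIR", "/mnt/shared-pv/clearml-cache")

from clearml import Dataset

ds = Dataset.get(dataset_name="my_dataset", dataset_project="my_project")
# As far as I can tell, get_local_copy() reuses the cached copy when this
# dataset version is already present and only downloads when it is missing.
local_path = ds.get_local_copy()
```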
Hi SuccessfulKoala55, thanks for pointing me to this repo. I was using this repo.
I didn't manage to find in this repo whether we still need to label the node app=clearml, as was mentioned in the deprecated repo, although from the values.yaml the node selector is empty. Would you be able to advise?
How is the ClearML data handled now, then? Thanks
Cool thanks guys. I am clearer now. Was confused by the obsolete info. Thanks for the clarification.
Hi CostlyOstrich36, I ran this task locally at first. This attempt was successful.
When I use this task as a step in a pipeline (the task runs remotely), it cannot find the external package. This seems logical, but I am not sure how to resolve it.
@<1526734383564722176:profile|BoredBat47> Just to check: do you need to run update-ca-certificates or an equivalent?
OK. Can I check whether only the main script is stored in the task, but not the dependent packages?
I guess the more correct way is to upload them to some repo that the remote task can still pull from?