
How do you handle private repos in clearml for packages?
Ohh, thanks! Will give it a shot now!
I added the following to the clearml.conf file:
agent {
    package_manager: {
        # supported options: pip, conda, poetry
        type: pip,
        extra_index_url: ["my_url"],
    },
}
For some reason the changes were not reflected; here are the logs from the agent:
agent.package_manager.type = pip
agent.package_manager.pip_version.0 = <20.2 ; python_version < '3.10'
agent.package_manager.pip_version.1 = <22.3 ; python_version >\= '3.10'
agent.package_manager.sys...
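For reference, the shape I was aiming for, with a credentialed private index (the URL, user, and token are placeholders; embedding credentials in the index URL is the standard pip pattern):

agent {
    package_manager: {
        type: pip,
        # placeholder: private index with embedded credentials
        extra_index_url: ["https://<user>:<token>@my.private.pypi/simple"],
    },
}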
I set my local laptop as an agent for testing purposes. I run the code on my laptop; it gets sent to the server, which sends it back to my laptop. So the conf file is technically on the worker, right?
The extra_index_url is not even showing up in the logs.
I am using the latest version of the clearml server, and version 1.9.1 of the SDK.
Here is the code that I am currently using:
from clearml import Dataset

if __name__ == "__main__":
    # create clearml data processing task
    dataset = Dataset.create(
        dataset_name="palmer_penguins",
        dataset_project="palmer penguins",
        dataset_tags=["raw"]
    )
    dataset_path = "data/raw/penguins.csv"
    # add the downloaded files to the current dataset
    dataset.add_files(path=dataset_path)
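A typical continuation of this snippet, assuming the standard Dataset workflow (upload() and finalize() are the stock SDK calls):

    # upload the registered files and close the dataset
    dataset.upload()
    dataset.finalize()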
@<1523701205467926528:profile|AgitatedDove14> So I was able to get it to pull the package by defining packages=None.
The second problem I am running into now is that one of the dependencies of the package is actually hosted in a private repo.
I tried to get around it by defining the environment variable PIP_INDEX_URL and passing it using log_os_environments in the clearml.conf, and I am now getting this message:
md-ap-feature-engineering/.venv/lib/p...
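For anyone following along, the log_os_environments entry lives under sdk.development in clearml.conf, if I am reading the config layout right; a minimal sketch:

sdk {
    development {
        # pass this environment variable through to the task
        log_os_environments: ["PIP_INDEX_URL"]
    }
}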
My server is hosted on AWS Fargate
The community server is working again.
Hi AgitatedDove14,
I am planning to use Terraform to retrieve the secrets from AWS; after I retrieve the user list from the Secrets Manager, I am going to pass them as environment variables.
The reason I am passing them as environment variables is that I couldn't find a way to automatically upload files to AWS EFS from Terraform, since the config file needs to be mounted as an EFS volume in the ECS task definition.
I was able to make the web authentication work while passing the followi...
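For context, the config in question is the server's fixed-users section; a minimal sketch of its shape (username, password, and display name are placeholders):

auth {
    fixed_users {
        enabled: true
        users: [
            { username: "jane", password: "change_me", name: "Jane Doe" }
        ]
    }
}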
SuccessfulKoala55 That seemed to do the trick, thanks for your help! 😄
Thanks @<1523701205467926528:profile|AgitatedDove14>
Not exactly: the first dataset gets pulled in the script using Dataset.get(), and the second dataset is an output dataset created with Dataset.create(), which means that dataset_1 is a parent dataset of dataset_2.
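A minimal sketch of that pattern (project and dataset names are placeholders; parent_datasets is the stock Dataset.create argument):

from clearml import Dataset

# pull the input dataset
dataset_1 = Dataset.get(
    dataset_project="palmer penguins",
    dataset_name="raw palmer penguins",
)

# create the output dataset, with dataset_1 as its parent
dataset_2 = Dataset.create(
    dataset_project="palmer penguins",
    dataset_name="processed palmer penguins",
    parent_datasets=[dataset_1],
)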
Thank you so much for your reply, will give that a shot!
I am currently running the scripts on WSL Ubuntu.
So what's the point of the alias? It's not very clear. Even after specifying an alias, I am still getting the following message: Dataset.get() did not specify alias. Dataset information will not be automatically logged in ClearML Server
Right, so I figured out why it was calling it multiple times. Every time a dataset is serialized, it calls the _serialize() function inside the clearml/datasets/dataset.py file, and the _serialize() method calls self.get(parent_dataset_id), which is the same get() method. This means that the user will always be prompted with the log, even if they are not "getting" a dataset. So anytime a user creates, uploads, or finalizes a dataset, they will be prompted with the message...
Thanks for the reply. I was trying out this feature on a dummy example. I used the following command:

dataset = Dataset.get(
    dataset_project="palmer penguins",
    dataset_name="raw palmer penguins",
    alias="my_test_alias_name",
    overridable=True,
)
That was the only time I called the get() command. I still got the message that I should specify the alias. I can try and do a bit of debugging to see why it gets called multiple times.
I knew that, I was just happy that we have an updated example 😁
The thing is, even on the community server, not all the datasets have automatic previews. So for the same code/dataset, some of the runs have previews and some of them don't.
I'm actually trying that as we speak 😛
Never mind, I figured out the problem. I needed to specify the --docker flag when running the clearml-agent.
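Something along these lines (the queue name is a placeholder):

clearml-agent daemon --queue default --docker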
Just waiting for the changes to be completed
I was able to resolve the issue. I am currently using clearml on WSL2, and my machine is connected to a VPN that allows me to connect to the clearml instance hosted on AWS. You were right, it was a network issue; I was able to resolve it by modifying my /etc/resolv.conf file.
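For anyone hitting the same thing, the usual WSL2 fix (assuming the VPN breaks WSL's auto-generated DNS) is to stop WSL from regenerating resolv.conf and point it at a reachable nameserver:

# /etc/wsl.conf
[network]
generateResolvConf = false

# /etc/resolv.conf (example; use your VPN's DNS server)
nameserver 10.0.0.2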
So I added the snippet above to the code, and now the preview for the first 10 rows shows up. However, the automatic preview is still not working.
Above is the response for the events.debug_images call.
The above output is on the clearml community server
Hi again @<1523701435869433856:profile|SmugDolphin23>,
I was able to run the pipeline remotely on an agent, but I am still facing the same problem with the code breaking on the exact same step that requires the docker container. Is there a way to debug what is happening? Currently there is no indication from the logs that it is running the code in the docker container. Here are the docker-related logs:
agent.docker_pip_cache = /home/amerii/.clearml/pip-cache
agent.docker_apt_cache =...
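In case it helps with debugging: one way to make sure a given pipeline step actually runs inside a container is to set the docker image on that step explicitly. A minimal sketch, assuming a PipelineController-based pipeline (the step name, function, and image are placeholders):

from clearml import PipelineController

def my_step_fn():
    # placeholder step body
    print("running inside the container")

pipe = PipelineController(
    name="my_pipeline",
    project="palmer penguins",
    version="0.0.1",
)

pipe.add_function_step(
    name="step_needing_docker",
    function=my_step_fn,
    docker="python:3.10",  # image the agent should use for this step
)

pipe.start()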