For on-premises deployment with premium features we have the Enterprise plan.
Hey @<1554275802437128192:profile|CumbersomeBee33> , "aborted" usually means that someone manually stopped the pipeline or one of its experiments. Can you provide us with the code you used to run it?
You can try to add the force_download=True flag to .get() to ignore the locally cached content. Let me know if it helps.
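For example (a minimal sketch, assuming the cached object is a task artifact; the project, task, and artifact names are hypothetical):

```python
from clearml import Task

# hypothetical project/task/artifact names
task = Task.get_task(project_name="examples", task_name="training-run")

# force_download=True bypasses the local cache and re-downloads the content
local_copy = task.artifacts["preprocessed_data"].get(force_download=True)
```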
The issue may be related to some edge cases we currently have when working with lightning >= 2.0; we should have better support in the upcoming release.
Hey @<1564422650187485184:profile|ScaryDeer25> , we just released clearml==1.11.1rc2 which should solve the compatibility issues for lightning >= 2.0. Can you install it and check whether it solves your problem?
Can you also tell us which OS you are using? And when you mentioned clearml version 1.5.1, did you mean the ClearML package or the clearml-agent package? They are different packages.
Yes, metrics can be saved in both steps and pipelines. As for project dashboards, I think we currently don't support them in the UI for pipelines. What you can do instead is run a special "reporting" task that queries all the pipeline runs from a specific project; with it you can then manually plot all the important information yourself.
To get the pipeline runs, please see the documentation here: https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelineco...
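A minimal sketch of such a reporting task, assuming the default layout where pipeline controllers live under a ".pipelines" subproject (project and pipeline names are hypothetical):

```python
from clearml import Task

# query all controller runs of a given pipeline
pipeline_runs = Task.get_tasks(
    project_name="MyProject/.pipelines/MyPipeline",  # hypothetical names
    task_filter={"type": ["controller"]},
)

for run in pipeline_runs:
    # get_reported_scalars() -> {title: {series: {"x": [...], "y": [...]}}}
    scalars = run.get_reported_scalars()
    print(run.id, run.status, list(scalars.keys()))
```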
I think you can set the CUDA version in the clearml.conf; alternatively, you can have the agent use a docker image with your required version of CUDA instead of setting the environment directly on the machine.
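Roughly like this in the agent's clearml.conf (a sketch; the CUDA version and docker image are placeholders):

```
agent {
    # pin the CUDA version the agent assumes when resolving packages
    cuda_version: 11.8

    # alternatively, run tasks inside a docker image that ships the right CUDA
    default_docker: {
        image: "nvidia/cuda:11.8.0-runtime-ubuntu22.04"
    }
}
```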
@<1637624992084529152:profile|GlamorousChimpanzee22> since you're using localhost I'm assuming it's MinIO. Is the S3 path you're trying to access something like this: None <some file or dir> ?
Hey @<1569858449813016576:profile|JumpyRaven4> , about your first point, what exactly is the question?
About your second point - you can try to manually save the final model and give it a proper file name; that way we will show it in the UI with the name you provided. Make sure to use xgboost.save_model and not raw pickle.
For your final question, given that your models have customised code, I can suggest trying to use clearml.OutputModel which will register the file you provide ...
This is the method you're looking for: None . Make sure you have a model saved on disk before using it, and if you don't want the model to be deleted from disk afterwards, set auto_delete_file=False.
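Putting the two together, a minimal sketch (project, task, and file names are hypothetical):

```python
import xgboost as xgb
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="xgboost-train")  # hypothetical names

# ... train your booster ...
booster = xgb.train(params={}, dtrain=dtrain)  # assumes dtrain is an xgb.DMatrix you built

# save with a meaningful file name via xgboost, not raw pickle
booster.save_model("churn_model.json")

# register the saved file so it shows up in the UI under that name
output_model = OutputModel(task=task, name="churn_model")
output_model.update_weights(
    weights_filename="churn_model.json",
    auto_delete_file=False,  # keep the local copy on disk
)
```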
Hello @<1533257278776414208:profile|SuperiorCockroach75> , thanks for asking. It's actually unsupervised, because modern LLMs are all trained to predict next/missing words, which is an unsupervised method.
clearml-data also supports glob patterns, so if you have your dataset files in the same directory as the experiment code, you can do something like clearml-data add --files *.csv and only add the CSV files.
There's no .gitignore-like functionality because clearml-data is not meant to track everything, and you need to be deliberate in what exactly you're adding. Hope this clarifies things.
Hey @<1523701083040387072:profile|UnevenDolphin73> , sorry for the late reply. I'm now investigating the issue you mentioned, where running a remote task with create_function_task fails. I can't quite reproduce it; can you please provide a complete runnable code snippet that fails the way you described?
If your git credentials are stored in the agent's clearml.conf, it means these are an HTTPS username/password pair. But you specified that the package should be downloaded via git ssh, and I assume you don't have SSH credentials in the agent's environment. So it can't authenticate with SSH, and pip doesn't know how to switch from git+ssh to git+https, because the package download is done by pip, not by clearml.
And there probably are auth errors if you scroll through the entire log ...
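For reference, the HTTPS credentials the agent uses live in its clearml.conf, roughly like this (values are placeholders); note that the requirement itself also needs to reference a git+https URL for them to apply:

```
agent {
    # HTTPS credentials the agent uses when cloning/installing from git
    git_user: "my-username"
    git_pass: "my-personal-access-token"
}
```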
That is not specific enough. Can you show the code? And ideally also the console log of the pipeline
Hey @<1526734437587357696:profile|ShaggySquirrel23> , what version of the clearml-agent are you using? Also, if I were you I'd check how much free disk space there is on the machine running the agents.
What happens if you comment out or remove the pipe.set_default_execution_queue('default') and use run_locally instead of start_locally?
Because in the current setup, you are basically asking to run the pipeline controller task locally, while the rest of the steps need to run on an agent machine. If you make the changes I suggested above, you will be able to run everything on your local machine.
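Assuming this is a PipelineController-based pipeline, the fully local variant looks roughly like this (a sketch):

```python
# instead of:
#   pipe.set_default_execution_queue('default')
#   pipe.start()
# run the controller and all of its steps in the local process:
pipe.start_locally(run_pipeline_steps_locally=True)
```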
Hey @<1661904968040321024:profile|SpotlessOwl43> that's a great question!
> How should the metric be saved, via report_single_value?
That's correct
> What should I enter into the title and series fields in Project Dashboard?
The title should be "Summary" and the series is the name of the single value you reported.
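For example (a minimal sketch; project, task, and metric names are hypothetical):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="report-single-value")
logger = task.get_logger()

# single values are shown in the UI under the "Summary" title,
# with the value name ("accuracy" here) as the series
logger.report_single_value(name="accuracy", value=0.93)
```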
Which gives me an idea. Could you please remove the entrypoint from the docker image altogether and try again?
Overriding the entrypoint in the image can lead to docker run/docker exec failing to work properly, because instead of a shell it will use your entrypoint to run everything.
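In Dockerfile terms, the idea is roughly this (a sketch; the base image and script path are placeholders):

```dockerfile
FROM nvidia/cuda:11.8.0-runtime-ubuntu22.04

# ENTRYPOINT ["/opt/my_startup.sh"]  <- remove this; it replaces the shell for every `docker run`
CMD ["/bin/bash"]  # CMD is only a default command and can be overridden safely
```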
Hey @<1523701083040387072:profile|UnevenDolphin73> what you're building here sounds like a useful tool. Let me understand what you're trying to achieve here, please correct me if I'm wrong:
- You want to create a set of `Step` classes with which you can define pipelines, that will be executed either locally or remotely.
- The pipeline execution is triggered from a notebook.
- The `steps` are predefined transformations; the user normally won't have to create their own steps.
Did I get all...
About the first question - yes, it will use the destination URI you set.
About the second point - did you archive or properly delete the experiments?
Hey @<1523701949617147904:profile|PricklyRaven28> , about the S3 loading issue: is the path to the model in the artifacts tab an S3 bucket or a local path?
Hey @<1644147961996775424:profile|HurtStarfish47> , you can use S3 for debug images specifically, see here: https://clear.ml/docs/latest/docs/references/sdk/logger/#set_default_upload_destination , but the metrics (everything you report, like scalars, single values, histograms, and other plots) are stored in the backend. The fact that you are almost running out of storage could be because of either t...
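For example (a minimal sketch; the bucket and task names are hypothetical):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="debug-images")
# store debug samples (e.g. from report_image) in your own S3 bucket
task.get_logger().set_default_upload_destination("s3://my-bucket/debug-samples/")
```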
Hey @<1678212417663799296:profile|JitteryOwl13> , just to make sure I understand, you want to make your imports inside the pipeline step function, and you're asking whether this will work correctly?
If so, then the answer is yes, it will work fine if you move the imports inside the pipeline step function.
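For example, with a decorator-based pipeline step (a sketch; the step and its logic are hypothetical):

```python
from clearml.automation import PipelineDecorator

@PipelineDecorator.component(cache=False)
def preprocess(csv_path: str):
    # imports live inside the step so they are available when the step
    # runs as a standalone task on an agent machine
    import pandas as pd

    return pd.read_csv(csv_path).dropna()
```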
Could you please run the misbehaving example, add a breakpoint in clearml/backend_interface/task/task.py, inside Task.update_output_model, on the line with url = output_model.update_weights( , and tell me what the value of model_path is? In case you're using virtual environments, the clearml library should be installed somewhere under <virtual env directory>/lib/python3.10/site-packages/clearml/
What happens if you set the new project name to f"{config.project_id}" (notice, no .pipelines )?
Do you mean that you want your published experiments to be either "approved" or "not approved" based on the presence of the attachments you mentioned?