Hi UnsightlySeagull42
Just making sure, the two scripts are on your git repo ?
Great!
I'll make sure the agent outputs the proper error
hmm I assume the reason is the cookie / storage changed?
You mean, is one solution better than combining, maintaining, and automating 3+ solutions (dvc/lakefs + mlflow + kubeflow/airflow)?
Yes, I'd say it is. BTW, if you have Airflow running for other automations, you can very easily combine that automation with ClearML and have a single Airflow automation for everything. The main difference is that now Airflow only launches logic, never the actual compute/data (which are launched and scaled via ClearML).
Does that make sense?
100%, using task_overrides would be the most convenient way
I think the issue is that you have to pass the project ID, not the project name (the project's unique ID is the property that is actually stored on the Task)
@<1523707653782507520:profile|MelancholyElk85> can you check the following works:
pipe.add_task(..., task_overrides={'project': Task.get_project_id(project_name='examples')})
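For context, a minimal sketch of the full call (the pipeline name, step name, and base task ID below are placeholders, and I'm assuming the current SDK where steps are added via add_step):
```python
from clearml import Task
from clearml.automation import PipelineController

# Hypothetical pipeline; the names and the base task ID are placeholders.
pipe = PipelineController(name='demo pipeline', project='examples', version='1.0.0')

# task_overrides expects the project *ID*, not its name, so resolve it first.
pipe.add_step(
    name='step_one',
    base_task_id='<draft-task-id>',  # must point to a Task in "draft" mode
    task_overrides={'project': Task.get_project_id(project_name='examples')},
)
```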
Okay there should not be any difference ...
Could you run your code not from the git repository?
I have a theory, you never actually added the entry point file to the git repo, so the agent never actually installed it, and it just did nothing (it should have reported an error, I'll look into it)
WDYT?
@<1523707653782507520:profile|MelancholyElk85>
What's the clearml version you are using?
Just making sure... base_task_id has to point to a Task that is in "draft" mode, for the pipeline to use it
ReassuredTiger98 I'm trying to debug what's going on, because it should have worked.
Regarding prints ...
```python
from clearml import Task
from time import sleep


def main():
    task = Task.init(project_name="test", task_name="test")
    d = {"a": "1"}
    print('uploading artifact')
    task.upload_artifact("myArtifact", d)
    print('done uploading artifact')
    # not sure if this helps but it won't hurt to debug
    sleep(3.0)


if __name__ == "__main__":
    main()
```
Hi @<1557899668485050368:profile|FantasticSquid9>
There is some backwards compatibility issue with 1.2 (I think).
Basically, what you need is to spin up a new one on a new session ID and re-register the endpoints
Ohh if this is the case, you might also consider using offline mode, so there is no need for backend
https://clear.ml/docs/latest/docs/guides/set_offline#setting-task-to-offline-mode
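A minimal sketch of the offline flow (the project and task names are placeholders):
```python
from clearml import Task

# Enable offline mode *before* Task.init(); nothing is sent to any server.
Task.set_offline(offline_mode=True)

task = Task.init(project_name='examples', task_name='offline run')
# ... your actual code ...
task.close()  # the console output includes the path of the offline session zip
```
The resulting session zip can later be imported into a server with Task.import_offline_session().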
Add '/', like you would with a file system:
Task.init(project_name='main_project/sub_project', task_name='test')
does this work for multiple levels?
Yep
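A quick sketch with deeper nesting (the names here are just examples):
```python
from clearml import Task

# Each '/' creates another sub-project level in the UI.
task = Task.init(
    project_name='main_project/sub_project/sub_sub_project',
    task_name='test',
)
```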
Yes you can drag it in the UI :) it's a new feature in v1
BTW: I think we had a better example, I'll try to look for one
In theory yes; in practice you will be using the same Docker image for all the services, and they will never interfere with one another. You also have the option to do more sophisticated stuff, like mapping the file-server data for a cleanup service (should be out in a few days :)), so it's a balance. Also remember that, relatively speaking, Docker containers are quite lightweight; this is not like running a VM per service...
The warning just lets you know that the current process stopped and it is being launched on a remote machine.
What am I missing? Is the agent failing to run the job that you create manually ?
(notice that when creating a job manually, there is no "execute_remotely", you just enqueue it, as it is not actually "running locally")
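To illustrate the difference, a small sketch (the queue name, project, and task IDs are assumptions):
```python
from clearml import Task

# Flow 1: start locally, then hand execution over to an agent.
# This is what prints the warning - the local process stops here.
task = Task.init(project_name='examples', task_name='remote run')
task.execute_remotely(queue_name='default', exit_process=True)

# Flow 2: create the job manually - clone an existing Task into draft mode
# and enqueue it; execute_remotely is never called because nothing ran locally.
# cloned = Task.clone(source_task='<existing-task-id>')
# Task.enqueue(cloned, queue_name='default')
```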
Make sense ?
@<1523707653782507520:profile|MelancholyElk85> I just ran a single-step pipeline and it seemed to use the "base_task_id" without cloning it...
Any insight on how to reproduce ?
WackyRabbit7 How do I reproduce it ?
Hi @<1742355077231808512:profile|DisturbedLizard6>
the problem may be in returning None in get_local_model_file()
This tracks, it means that the model file cannot be downloaded for some reason,
when you click on the model here: None
what does it say under "MODEL URL:"?
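If it's easier to check programmatically, a small sketch (the model ID is a placeholder):
```python
from clearml import InputModel

# Inspect the URL registered for the model; if it's empty or points to a
# location this machine cannot reach, downloading a local copy will fail.
model = InputModel(model_id='<model-id>')
print(model.url)

local_path = model.get_local_copy()  # may return None if the download fails
print(local_path)
```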
Hi ReassuredOwl55
How would I find Tasks that have the same code with different inputs/parameters?
Assuming you have the git repo
you can do:
Task.query_tasks(..., task_filter={'_all_': dict(fields=['script.repository'], pattern='github.com/user/repo')})
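For example, a fuller sketch (the repository pattern and the extra return field are assumptions):
```python
from clearml import Task

# Find tasks whose recorded repository matches the pattern, and also return
# the stored hyperparameters so runs of the same code can be compared.
tasks = Task.query_tasks(
    task_filter={
        '_all_': dict(fields=['script.repository'], pattern='github.com/user/repo'),
    },
    additional_return_fields=['hyperparams'],
)
print(tasks)
```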
wdyt?
BattyLion34 is this running with an agent ?
What's the comparison with a previously working Task (in terms of python packages) ?
Hmm HandsomeGiraffe70
This seems like a bug, let me see what we can do about that
could it be the parent version was created with an older version of clearml sdk ?
In Windows, setting system_site_packages to true allowed all stages in the pipeline to start - but it doesn't work in Linux.
Notice that it will inherit from the system packages, not the venv the agent is installed in
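For reference, this is the relevant key in the agent's clearml.conf (only the relevant section is shown; the rest of the file depends on your setup):
```
agent {
    package_manager {
        # reuse packages already available to the agent's Python interpreter
        # instead of resolving everything inside the freshly created venv
        system_site_packages: true
    }
}
```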
I've deleted tfrecords from the master branch and committed the removal, and set the folder for tfrecords to be ignored in .gitignore. Trying to find which changes are considered to be uncommitted.
you can run git diff
it is essentially...
yes.
Obviously when you import the offline session, you will need to set it to point to your server with the correct credentials
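A sketch of the import side (the server URLs and keys are placeholders; usually the clearml.conf on that machine already holds them):
```python
from clearml import Task

# Point the SDK at the target server; normally this comes from clearml.conf,
# but it can also be set programmatically before importing the session.
Task.set_credentials(
    api_host='https://api.clear.ml',
    web_host='https://app.clear.ml',
    files_host='https://files.clear.ml',
    key='<access_key>',
    secret='<secret_key>',
)
Task.import_offline_session('/path/to/offline_session.zip')
```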
Specifically your error seems to be an issue with nvidia Triton container upgrade
Hi MelancholyBeetle72
You mean the venv creation takes the bulk of the time, or is it something else?
The class documentation itself is also there under "References" -> "Trains Python Package"
Notice that due to a bug in the documentation (we are working on a fix) the reference part is not searchable in the main search bar