Hi SuperficialGrasshopper36
/home/ubuntu/.clearml/venvs-builds.1/3.8/task_repository/repository_name/.venv
This is the problem: they should not be installed there, they should be in /home/ubuntu/.clearml/venvs-builds.1/3.8/
Could you post the poetry.lock file? Maybe it is something there?
What are the poetry and clearml-agent versions?
Hi @<1727497172041076736:profile|TightSheep99>
Yes it can, it will upload the meta-data as well as the files (it will also de-dup and will not upload files that already exist in the dataset, based on the hash of the file content)
AstonishingSeaturtle47 I think there's a workaround for the GitHub multiple repo issue. See https://gist.github.com/gubatron/d96594d982c5043be6d4
GiganticTurtle0 I'm not sure I follow, what do you mean by indexing the arguments? Can you post a short usage example ?
Hi CrookedAlligator14
Hi, I just started using clearml, and it is amazing!
Thank you!
When I enqueue the task, the venv is set up and starts to install all the packages from the requirements.txt file, but at the end I get the following in the console:
Can you try with the latest agent? We improved the support for pytorch (they now have a proper pypi compatible repo), can you see if that solves it?
pip3 install clearml-agent==1.5.0rc0
Hi EagerOtter28
Let's say we query another time and get 60k images. Now it is not trivial to create a new dataset B but only upload the diff: ...
Use Dataset.sync (or clearml-data sync) to check which files were changed/added.
All files are already hashed, right? I wonder why clearml-data does not keep files in a semi-flat hierarchy and groups them together to datasets?
It kind of does, it has a full listing of all the files with their hash (SHA2) values, ...
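To make the "listing of files with their hash" idea concrete, here is a minimal sketch of how a diff between two dataset versions could be computed from such a listing. This is only an illustration and not the clearml-data implementation: the function names `hash_listing` and `diff_listings` are made up for this example, and SHA-256 is assumed as the SHA2 variant.

```python
import hashlib
from pathlib import Path


def hash_listing(root):
    """Map each file path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    listing = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            listing[path.relative_to(root).as_posix()] = digest
    return listing


def diff_listings(parent, current):
    """Compare two listings and report what changed relative to the parent."""
    added = sorted(p for p in current if p not in parent)
    modified = sorted(
        p for p in current if p in parent and current[p] != parent[p]
    )
    removed = sorted(p for p in parent if p not in current)
    return {"added": added, "modified": modified, "removed": removed}
```

With a listing like this, only the "added" and "modified" entries would need to be uploaded for the new version; everything else can be referenced from the parent.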
So now for it to take place you need to enqueue the Task and set an agent to pick it up and run it.
When the agent is running the Task the new parameter will be passed.
does that make sense ?
EnviousStarfish54
it seems that if I don't use plt.show() it won't show up in Allegro, is this a must?
Yes, at plt.show / plt.save Trains will capture the plot and send it to the backend.
BTW: when you hover over the empty plot area, do you see the plotly objects, or is it all blank ?
Hi DilapidatedDucks58 ,
I'm not aware of anything of this nature, but I'd like to get a bit more information so we could check it.
Could you send the web-server logs ? either from the docker or the browser itself.
Seems like everything is in order. Can you curl to the API/web/files server?
When you say status, what do you mean? Is it active? Running a task?
The thing I don't understand is how come this DOES work on our linux setups
I do not think it actually works... I could not find any code that converts the ENV in the config string ...
I'll be happy to test it out if there's any commit available?
Please do, and feel free to PR it
https://github.com/allegroai/clearml/blob/d3e986393ac8d1a1ea48302224962570ab8e6f9e/clearml/backend_api/session/session.py#L576
This is very odd ... let me check something
Working on it as we speak. Hopefully in the next release (probably next week)
Yes, sorry, that wasn't clear
Hi ReassuredTiger98
Good point, since the user actually "running" the code is the agent, all the api calls are registered under its name, including the Model creation.
This is a good point, though ...
I know the enterprise tiers add "impersonate" as part of the security layer, meaning that the agent is not actually running the code but the creating "user" is, which solves this problem. I'm not sure what can actually be done without this feature... thoughts?
Agent works when I am running it from a virtual environment, but it gets stuck in the same place every time when I am using Docker
Can you please provide a log? I'm not sure what "stuck" means here
you should have something like 192.168... or 10.0 ....
And maybe adding idle time spent without a job to the API is not such a bad idea
yes, adding that to the feature list
What if I write the last active state in an instance tag? This could be a solution...
I love this hack, yes this should just work.
BTW: if your lambda is a for loop that is constantly checking, there is no need to actually store the "last idle timestamp check" as a tag, no?
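A rough sketch of what "keep it in the loop instead of a tag" could look like, assuming the checker is one long-running process. The class name `IdleTracker`, the `max_idle_seconds` budget, and the injectable `clock` are all hypothetical and only there for illustration:

```python
import time


class IdleTracker:
    """Remember, in process memory, since when each instance has been idle."""

    def __init__(self, max_idle_seconds, clock=time.time):
        self.max_idle_seconds = max_idle_seconds
        self.clock = clock
        # instance id -> timestamp of the first poll that saw it idle
        self.idle_since = {}

    def observe(self, instance_id, is_busy):
        """Record one poll result; return True once the idle budget is used up."""
        now = self.clock()
        if is_busy:
            # Any activity resets the idle timer for this instance
            self.idle_since.pop(instance_id, None)
            return False
        started = self.idle_since.setdefault(instance_id, now)
        return now - started >= self.max_idle_seconds
```

Note that this state lives only in memory, so it is lost if the checking process restarts. If the check runs as separate short invocations rather than one loop, writing the timestamp to an instance tag is probably still the way to go.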
- Components anyway need to be available when you define the pipeline controller/decorator, i.e. same codebase
No, you can specify a different code base, see here:
- The component code still needs to be self-composed (or, function component can also be quite complex)
Well it can address the additional repo (it will be automatically added to the PYTHONPATH), and you c...
the issue moving forward is if we restart the pod we will have to manually update that again.
Can't you map the nginx configuration file ? (making the changes persistent across pods)
so it would be better just to use the original code files and the same conda env, if possible...
Hmm you can actually run your code in "agent mode" assuming you have everything else setup.
This basically means you set a few environment variables prior to launching the code:
Basically:
export CLEARML_TASK_ID=<The_task_id_to_run>
export CLEARML_LOG_TASK_TO_BACKEND=1
export CLEARML_SIMULATE_REMOTE_TASK=1
python my_script_here.py
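If you would rather launch it from Python than from the shell, the same idea could be sketched like this. Only the three environment variable names come from the message above; the helper names `agent_mode_env` and `run_in_agent_mode` are hypothetical:

```python
import os
import subprocess
import sys


def agent_mode_env(task_id, base_env=None):
    """Copy the environment and add the three "agent mode" variables."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(
        {
            "CLEARML_TASK_ID": task_id,
            "CLEARML_LOG_TASK_TO_BACKEND": "1",
            "CLEARML_SIMULATE_REMOTE_TASK": "1",
        }
    )
    return env


def run_in_agent_mode(script_path, task_id):
    """Run the script with the current interpreter and return its exit code."""
    completed = subprocess.run(
        [sys.executable, script_path],
        env=agent_mode_env(task_id),
        check=False,
    )
    return completed.returncode
```

Running it with the current interpreter (sys.executable) keeps the same conda env, which is the point of doing it this way.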
Okay, I'll pass to front-end, see what they can do about it.
Hmm yes we should probably provide metrics:
client.workers.get_stats(..., items=[dict(key='cpu_usage'), dict(key='gpu_usage')])
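As a sketch of what might go in place of the "..." above: as far as I recall, the call also expects a time window and a sampling interval, but the field names `from_date`, `to_date` and `interval` used here are assumptions, so please check them against the API reference for your server version. The helper name `build_worker_stats_request` is made up for this example:

```python
import time


def build_worker_stats_request(
    hours=1.0,
    interval_sec=60,
    keys=("cpu_usage", "gpu_usage"),
    now=None,
):
    """Build keyword arguments for a worker statistics query (field names assumed)."""
    to_date = time.time() if now is None else now
    return {
        "from_date": to_date - hours * 3600,  # assumed: seconds since epoch
        "to_date": to_date,                   # assumed: seconds since epoch
        "interval": interval_sec,             # assumed: seconds per data point
        "items": [dict(key=k) for k in keys],  # same shape as in the message above
    }
```

The result would then be passed along as client.workers.get_stats(**build_worker_stats_request()), with the client created as in the message above.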
It is recommended to create a git TOKEN with read only permissions and use it (more secure)
My current experience is there is only print out in the console but no training graph
Yes Nvidia TLT needs to actually use tensorboard for clearml to catch it and display it.
I think that in the latest version they added that. TimelyPenguin76 might know more