hey @<1523701827080556544:profile|JuicyFox94>
standard standalone Linux using compose
the application is functional on localhost for sure
I'm looking at iptables configuration that was done by other teams
trying to find which rule blocks clearml
(all worked when iptables disabled)
oh boy, how much I hate reverse engineering a setup I didn't build 😞
I'll dig in more
{
"id": "c2e33466fd9de1a0d2be8d803ad4fbed",
"company": "d1bd92a3b039400cbafc60a7a5b1e52b",
"name": "John Doe",
"family_name": "John",
"given_name": "Doe",
"created": "2024-09-24T06:13:59.956000+00:00"
}
this is what the object looks like
yep, that was my approach, with no luck so far
hopefully someone from the ClearML dev team can give their input on this
in the UI I also see the display name, so I pulled all the users' info and matched names to ids
how do you know the ID?
I'm looking for this ...
correct, but!
I wrote a script that pulls tasks and limit for user
so I'm looking for a way for users to know their own id
in advance
for some reason it's not in the REST API docs, but I used users.get_all
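a minimal sketch of the name-to-id matching described above - assuming the user objects come back shaped like the JSON sample earlier in the thread, and that APIClient exposes the undocumented users.get_all endpoint (it maps REST endpoints dynamically):

```python
def match_name_to_id(users, display_name):
    """Match the display name shown in the UI to a user id.

    `users` is a list of dicts shaped like the user object above.
    """
    return {u["name"]: u["id"] for u in users}.get(display_name)

# Against a live server this would be fed from the (undocumented)
# users.get_all endpoint, e.g.:
#   from clearml.backend_api.session.client import APIClient
#   users = [u.to_dict() for u in APIClient().users.get_all()]
#   print(match_name_to_id(users, "John Doe"))
```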
tried with my user and edited existing user record in apiserver.conf
it looks like ClearML treated this as a new user - I did not see any of the jobs belonging to my user before the change
we will probably end up pulling the images from docker.io and pushing those to our container registry
@<1523701070390366208:profile|CostlyOstrich36> unfortunately, this is not the behavior we are seeing
same exact issue happened tonight
on epoch number 53 ClearML was shut down; the job did not continue to epoch 54 and eventually got killed by the watchdog timer
you are correct and thank you for the reply @<1523701070390366208:profile|CostlyOstrich36>
going forward, I assume the clearml-server open-source releases will continue to be published on Docker Hub
Hi VivaciousPenguin66
thanks for sharing, giving it a try now
after you set up the webserver to point to 443 with HTTPS, what did you do with the rest of the HTTP services clearml is using?
does the webserver on 8080 remain accessible, and are you pointing to it in your ~/clearml.conf?
what about apiserver and file server? (8008 & 8081)
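for context, a minimal sketch of the client side once all three services sit behind TLS - hostnames here are placeholders, and it assumes a reverse proxy terminates HTTPS in front of the webserver, apiserver, and fileserver - in ~/clearml.conf:

```
api {
    # hostnames are examples; all three behind the TLS-terminating proxy
    web_server: https://app.example.com
    api_server: https://api.example.com
    files_server: https://files.example.com
}
```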
I didn't see anything useful in elastic/mongo/api
I also see significant slowness when querying my experiments
no filtering for sure
if I send link to task, sometimes it loads and sometimes it's stuck
AgitatedDove14 indeed there are few sub projects
do you suggest to delete those first?
I'd guess mongo is choking, not sure why
I think there are some experiments that are messing up mongodb
this log looks unusual in the clearml-mongo logs:
{"t":{"$date":"2023-09-19T12:15:50.685+00:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn73","msg":"Slow query","attr":{"type":"command","ns":"backend.model","command":{"distinct":"model","key":"project","query":{"$and":[{"$or":[{"company":{"$in":["d1bd92a3b039400cbafc60a7a5b1e52b",null,""]}},{"company":{"$exists":false}}]},{"user":{"$in":["197aea8467d3f471fc0db98b57ed80fa"]...
app.component.ts:138 ERROR TypeError: Cannot read properties of null (reading 'id')
at projects.effects.ts:60:56
at o.subscribe.a (switchMap.js:14:23)
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
at withLatestFrom.js:26:28
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
at filter.js:6:128
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
and I also see it when trying to...
not sure it's the same use case, but I'll start asking people around
if you have any other hint on how to query mongo and look for a potential culprit - I'd be glad to hear it
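one hedged way to hunt for a culprit - a sketch assuming pymongo and the `backend` database / collection names visible in the slow-query log above - group documents by project and look for oversized buckets:

```python
def top_counts(collection, field="project", limit=10):
    """Group documents by `field` and return the biggest buckets -
    a quick way to spot a project bloated with experiments."""
    pipeline = [
        {"$group": {"_id": "$" + field, "n": {"$sum": 1}}},
        {"$sort": {"n": -1}},
        {"$limit": limit},
    ]
    return list(collection.aggregate(pipeline))

# Against the live mongo (assumes pymongo and the db name from the log):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["backend"]
#   print(top_counts(db["task"]))
#   print(top_counts(db["model"]))
```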
api calls behave much better
no problem to query tasks in other projects
to be honest, the use case is mostly convenience
when people train ~5000+ experiments, all saved in a few sub folders with long strings as experiment names
before publishing a paper, for example, we want to move/copy a small number of successful trainings to a separate location and share them with other colleagues/management
I'd guess the alternative can be
changing the name of the successful training under the existing sub folder
using move instead of clone
anything else?
ok, hopefully someone will share some thoughts and how it went 🙂
thanks @<1523701070390366208:profile|CostlyOstrich36>
I've done this successfully using the API already
as for the sdk option - in which format should I provide the list of tasks/projects to the sdk?
for only_fields=["id", "name", "created", "status_changed", "status", "user"]:
output example
{'id': '02a3f5929cf246138994c9243a692219', 'name': 'docfm_v7_safe_32gpu80g_11Jan24_4w', 'created': datetime.datetime(2024, 1, 11, 9, 54, 33, 406000, tzinfo=tzutc()), 'status_changed': dateti...
so I have large json, with list of task id's
which I want to delete in bulk
API is doable
how about the sdk? how do I provide a list of task id's for deletion?
from the cleanup example:
for task in tasks:
    try:
        deleted_task = Task.get_task(task_id=task.id)
        deleted_task.delete()
    except Exception as ex:
        print(ex)
how do I set tasks, when coming from a known list of task id's?
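a minimal sketch for the "known list of ids" case - it assumes the clearml SDK is installed and just wraps Task.get_task / Task.delete from the cleanup example (the get_task parameter is only there so the loop can be exercised without a server):

```python
import json

def delete_tasks(task_ids, get_task=None):
    """Delete every task id in the list; returns the ids that failed."""
    if get_task is None:
        from clearml import Task  # assumes the clearml SDK is installed
        get_task = Task.get_task
    failed = []
    for tid in task_ids:
        try:
            get_task(task_id=tid).delete()
        except Exception:
            failed.append(tid)
    return failed

# e.g. with the large json of ids mentioned above:
#   with open("task_ids.json") as f:
#       failed = delete_tasks(json.load(f))
```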
I know about the 500 limit and am using it
but my while loop keeps pulling the same 500 ... and running endlessly
ohhh severe error here 🙂
I got mixed up with another API I worked on .. and did not read the right flag carefully
simply adding page to the body did the trick
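the fix above can be sketched as a generic paging loop - a hedged example where fetch_page stands in for whatever issues the tasks.get_all request; without incrementing page, the server keeps returning the same first batch:

```python
def fetch_all(fetch_page, page_size=500):
    """Page through results by incrementing `page` each round -
    without this the server keeps returning the same first 500."""
    results, page = [], 0
    while True:
        batch = fetch_page(page=page, page_size=page_size)
        if not batch:
            break
        results.extend(batch)
        page += 1
    return results

# With the REST API the two fields go straight into the request body, e.g.:
#   body = {"page": page, "page_size": 500, "only_fields": ["id", "name"]}
```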
thanks again @<1724235687256920064:profile|LonelyFly9>