
{
"id": "c2e33466fd9de1a0d2be8d803ad4fbed",
"company": "d1bd92a3b039400cbafc60a7a5b1e52b",
"name": "John Doe",
"family_name": "John",
"given_name": "Doe",
"created": "2024-09-24T06:13:59.956000+00:00"
}
this is how the object looks
yep, that was my approach, with no luck so far
hopefully someone from the ClearML dev team can give their input on this
how do you know the ID?
I'm looking for this ...
correct, but!
I wrote a script that pulls tasks with a limit per user
so I'm looking for a way for users to know their own ID
in advance
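A minimal sketch of how a user could look up their own ID, assuming the server exposes a `users.get_current_user` endpoint and the usual `{"data": {"user": {...}}}` response envelope (both are assumptions here, as is the basic-auth scheme):

```python
def extract_user_id(payload):
    # assumes the envelope is {"data": {"user": {"id": ...}}} (assumption)
    return payload["data"]["user"]["id"]

def get_my_user_id(api_server, access_key, secret_key):
    # hypothetical call; adjust endpoint/auth to your server setup
    import requests  # deferred so extract_user_id works without it
    resp = requests.post(f"{api_server}/users.get_current_user",
                         auth=(access_key, secret_key))
    resp.raise_for_status()
    return extract_user_id(resp.json())
```

The extractor matches the shape of the user object shown above.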
I built a basic nginx container
` FROM nginx
COPY ./default.conf /etc/nginx/conf.d/default.conf
COPY ./includes/ /etc/nginx/includes/
COPY ./ssl/ /etc/ssl/certs/nginx/ `
copied the signed certificates and the modified nginx default.conf
the important part is to modify the compose file to redirect all traffic to the nginx container
` reverse:
  container_name: reverse
  image: reverse_nginx
  restart: unless-stopped
  depends_on:
    - apiserver
    - webserver
    - fil...
Hi VivaciousPenguin66
thanks for sharing, giving it a try now
after you set up the webserver to point to 443 with HTTPS, what did you do with the rest of the HTTP services ClearML is using?
does the webserver on 8080 remain accessible, and are you pointing to it in your ~/clearml.conf?
what about apiserver and file server? (8008 & 8081)
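For reference, a minimal sketch of what such a reverse-proxy `default.conf` might route, assuming the default compose service names and internal ports (webserver on 80, apiserver on 8008, fileserver on 8081); the hostnames and certificate paths are hypothetical:

```nginx
# hypothetical hostnames; adjust to your DNS and certificates
server {
    listen 443 ssl;
    server_name app.example.com;
    ssl_certificate     /etc/ssl/certs/nginx/clearml.crt;
    ssl_certificate_key /etc/ssl/certs/nginx/clearml.key;
    location / { proxy_pass http://webserver:80; }
}

server {
    listen 443 ssl;
    server_name api.example.com;
    ssl_certificate     /etc/ssl/certs/nginx/clearml.crt;
    ssl_certificate_key /etc/ssl/certs/nginx/clearml.key;
    location / { proxy_pass http://apiserver:8008; }
}

server {
    listen 443 ssl;
    server_name files.example.com;
    ssl_certificate     /etc/ssl/certs/nginx/clearml.crt;
    ssl_certificate_key /etc/ssl/certs/nginx/clearml.key;
    location / { proxy_pass http://fileserver:8081; }
}
```

With this layout, clearml.conf would then point `api_server`, `web_server`, and `files_server` at the three HTTPS hostnames instead of the raw ports.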
I tried a small task that only uploads a single file:
` from PIL import Image

logger = task.get_logger()
img = Image.open("./1_model.png").convert("RGB")
logger.report_image(title="cfg_0", series="Model", iteration=1, image=img) `
ended with:
Retrying (Retry(total=0, connect=5, read=5, redirect=5, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)'))': /
202...
the application is functional on localhost for sure
the way I do pagination is wrong
@<1523701070390366208:profile|CostlyOstrich36> might throw some champions tip over here 🙂
OK I got everything to work
I think this script can be useful to other people and will be happy to share
@<1523701070390366208:profile|CostlyOstrich36> is there a repo I can fork and contribute to?
I know about the 500 limit and I'm using it
but my while loop keeps pulling the same 500 ... and running endlessly
import requests

offset = 0
limit = 500
all_data = []
while True:
    params = {
        'offset': offset,
        'limit': limit
    }
    response = requests.get(url, headers=headers, params=params, verify=False)
    data = response.json()
    projects = data['data']['projects']
    print(f"pulled {len(projects)} projects.")
    if len(projects) == 0:
        print("no project found - exiting ...")
        break
    all_data.extend(projects)
    offset += limit
    print(f"pulled {le...
ohhh severe error here 🙂
I mixed it up with another API I worked on .. and did not read the right flag carefully
simply adding page to the body did the trick
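The fixed loop can be sketched generically. Here the server call is stubbed out as a `fetch_page(page, page_size)` callable, since (per the thread) this endpoint pages by a `page` field in the request body rather than by `offset`:

```python
def fetch_all(fetch_page, page_size=500):
    """Collect every item from a page-based endpoint.

    fetch_page(page, page_size) -> list of items; an empty list ends the loop.
    """
    all_items = []
    page = 0
    while True:
        items = fetch_page(page, page_size)
        if not items:
            break
        all_items.extend(items)
        page += 1
    return all_items
```

In the real script, `fetch_page` would be a small wrapper that POSTs a body like `{'page': page, 'page_size': page_size}` to the projects endpoint (the exact field names are an assumption; check the API docs for your server version).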
thanks again @<1724235687256920064:profile|LonelyFly9>
@<1523701435869433856:profile|SmugDolphin23> working! here is what I have on Fedora/RHEL
- copy the certs to /etc/pki/ca-trust/source/anchors/
- run update-ca-trust
you are correct and thank you for the reply @<1523701070390366208:profile|CostlyOstrich36>
going forward, I assume the open-source clearml-server releases will continue to be published on Docker Hub
we will probably end up pulling the images from docker.io and pushing those to our container registry
app.component.ts:138 ERROR TypeError: Cannot read properties of null (reading 'id')
at projects.effects.ts:60:56
at o.subscribe.a (switchMap.js:14:23)
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
at withLatestFrom.js:26:28
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
at filter.js:6:128
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
and I see also when trying to...
I didn't see anything useful in the elastic/mongo/api logs
I also see significant slowness when querying my experiments
no filtering for sure
if I send link to task, sometimes it loads and sometimes it's stuck
to be honest, the use case is mostly convenience
when people train ~5000+ experiments, all saved in a few sub-folders with long strings as experiment names
before publishing a paper, for example, we want to move/copy a small number of successful trainings to a separate location and share them with other colleagues/management
I'd guess the alternatives could be:
- changing the name of the successful training under the existing sub-folder
- using move instead of clone
anything else?
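For the move option, a minimal sketch using the SDK's `Task.move_to_project` (the target project name here is hypothetical, and `dry_run` keeps the loop side-effect free so it can be tried safely first):

```python
def move_tasks(task_ids, target_project, dry_run=True):
    """Move each task to target_project; with dry_run=True only report what would move."""
    moved = []
    for task_id in task_ids:
        if not dry_run:
            from clearml import Task  # deferred import; requires clearml installed
            Task.get_task(task_id=task_id).move_to_project(
                new_project_name=target_project)
        moved.append(task_id)
    return moved
```

Called with e.g. `move_tasks(ids, "papers/2024_successful", dry_run=False)` once the dry run looks right.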
hey @<1523701827080556544:profile|JuicyFox94>
standard standalone Linux using compose
AgitatedDove14 indeed there are a few sub-projects
do you suggest deleting those first?
@<1523701070390366208:profile|CostlyOstrich36> unfortunately, this is not the behavior we are seeing
the same exact issue happened tonight
on epoch 53 ClearML was shut down; the job did not continue to epoch 54 and eventually got killed by the watchdog timer
@<1523701070390366208:profile|CostlyOstrich36> sorry for not being clear enough
when will the next version of clearml-server be released? I can see the last version is from August; is there an ETA for a new release in the upcoming 1-2 months?
looking into ES index events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b
docs.count docs.deleted store.size pri.store.size
2118131043 29352476 265.1gb 265.1gb
sounds like we're hitting some ES limitation? (that doc count is close to Lucene's ~2.1 billion per-shard hard limit)
unfortunately I couldn't fix this
the ES state is hectic, I can't delete anything
clearml is still live, read-only mode, all existing indices are readable
new jobs can't write to this clearml server
yep, again most jobs work .. the issue is when a job tries to upload artifacts to the fileserver
thanks @<1523701070390366208:profile|CostlyOstrich36>
I've done this successfully using the API already
as for the sdk option - in which format should I provide the list of tasks/projects to the sdk?
for only_fields=["id", "name", "created", "status_changed", "status", "user"], the output looks like:
{'id': '02a3f5929cf246138994c9243a692219', 'name': 'docfm_v7_safe_32gpu80g_11Jan24_4w', 'created': datetime.datetime(2024, 1, 11, 9, 54, 33, 406000, tzinfo=tzutc()), 'status_changed': dateti...
so I have a large JSON with a list of task IDs
which I want to delete in bulk
the API way is doable
how about the SDK? how do I provide a list of task IDs for deletion?
from the cleanup example:
for task in tasks:
    try:
        deleted_task = Task.get_task(task_id=task.id)
        deleted_task.delete(
how do I set tasks, coming from a known list of task IDs?
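A minimal sketch of how that could look, assuming the JSON file is a flat list of task-ID strings (an assumption about the file format); `dry_run` keeps the loop side-effect free until you're sure the list is right:

```python
import json

def load_task_ids(path):
    # assumes a flat JSON list of task-ID strings,
    # e.g. ["02a3f5929cf246138994c9243a692219", ...]
    with open(path) as f:
        return json.load(f)

def delete_tasks(task_ids, dry_run=True):
    """Delete each task by ID; with dry_run=True only report what would go."""
    deleted = []
    for task_id in task_ids:
        if not dry_run:
            from clearml import Task  # deferred import; needs clearml installed
            Task.get_task(task_id=task_id).delete()
        deleted.append(task_id)
    return deleted
```

So instead of iterating over task objects as in the cleanup example, the loop iterates directly over the ID strings loaded from the file.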
not sure it's the same use case, but I will start asking people around
if you have any other hint on how to query mongo and look for a potential culprit - I'd be glad to hear it
I think there are some experiments that are messing up mongodb
this log line in the clearml-mongo logs looks unusual:
{"t":{"$date":"2023-09-19T12:15:50.685+00:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn73","msg":"Slow query","attr":{"type":"command","ns":"backend.model","command":{"distinct":"model","key":"project","query":{"$and":[{"$or":[{"company":{"$in":["d1bd92a3b039400cbafc60a7a5b1e52b",null,""]}},{"company":{"$exists":false}}]},{"user":{"$in":["197aea8467d3f471fc0db98b57ed80fa"]...