
the way I do pagination is wrong
@<1523701070390366208:profile|CostlyOstrich36> might throw some champions tip over here 🙂
you are correct and thank you for the reply @<1523701070390366208:profile|CostlyOstrich36>
going forward, I assume the clearml-server open-source releases will continue to be released on Docker Hub
looking into ES index events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b
docs.count    docs.deleted    store.size    pri.store.size
2118131043    29352476        265.1gb       265.1gb
sounds like we're hitting some ES limitation?
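For context on why that question is plausible: Lucene (which backs Elasticsearch) hard-caps a single shard at Integer.MAX_VALUE - 128 documents, and the docs.count above is within ~1.4% of that ceiling. The single-shard assumption in this sketch is mine, not stated in the thread:

```python
# Rough sanity check: Lucene hard-caps one shard at Integer.MAX_VALUE - 128 docs.
# docs_count comes from the index stats above; the single-shard assumption is mine.
LUCENE_MAX_DOCS = 2**31 - 1 - 128  # 2_147_483_519
docs_count = 2_118_131_043         # docs.count of the events-training_stats_scalar index
fill_ratio = docs_count / LUCENE_MAX_DOCS
print(f"shard fill ratio: {fill_ratio:.3f}")  # roughly 0.986
```

If the index actually spans several primary shards the per-shard count would be lower, so checking `pri` in `_cat/indices` is worth doing before concluding anything.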
offset = 0
limit = 500
all_data = []
while True:
    params = {
        'offset': offset,
        'limit': limit
    }
    response = requests.get(url, headers=headers, params=params, verify=False)
    data = response.json()
    projects = data['data']['projects']
    print(f"pulled {len(projects)} projects.")
    if len(projects) == 0:
        print("no project found - exiting ...")
        break
    all_data.extend(projects)
    offset += limit
    print(f"pulled {le...
ohhh severe error here 🙂
I was mixing it up with another API I worked on .. and did not read the right flag carefully
simply adding page to the body did the trick
thanks again @<1724235687256920064:profile|LonelyFly9>
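For anyone hitting the same endless loop: a minimal sketch of page-based pulling, with the paging logic factored out so it can be reused. The `fetch_page` callable is an illustration of mine; wire it to your own endpoint and auth:

```python
def fetch_all(fetch_page, page_size=500):
    """Collect every item by requesting page 0, 1, 2, ... until a page
    comes back empty. fetch_page(page, page_size) must return a list."""
    page, items = 0, []
    while True:
        batch = fetch_page(page, page_size)
        if not batch:
            break
        items.extend(batch)
        page += 1
    return items
```

With `requests` this would be roughly `fetch_page = lambda p, s: requests.post(url, headers=headers, json={'page': p, 'page_size': s}, verify=False).json()['data']['projects']`, i.e. `page`/`page_size` in the JSON body rather than `offset`/`limit` query params.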
I'd guess mongo is choking, not sure why
I had a slightly similar scenario ~1 year and a few versions back
there was some task that wrote a lot of tasks and mongo didn't take it nicely
I was able to identify it only by questioning users, and eventually one of them stopped sending, mongo started to come back, and everything returned to normal
we never came to any wise conclusion about the root cause or how to identify this
app.component.ts:138 ERROR TypeError: Cannot read properties of null (reading 'id')
at projects.effects.ts:60:56
at o.subscribe.a (switchMap.js:14:23)
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
at withLatestFrom.js:26:28
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
at filter.js:6:128
at p._next (OperatorSubscriber.js:13:21)
at p.next (Subscriber.js:31:18)
and I see also when trying to...
if I use the URL directly
None
I see it (the list to the left shows no data as well)
console showed 401 unauthorized
when I tried it
I tried again now and it magically popped up 🤔
I can see tasks in the project, but nothing for my user
Hey @<1688125253085040640:profile|DepravedCrow61>
should I open issue to follow this up?
we are seeing this bug in almost every task
API calls behave much better
no problem to query tasks in other projects
to be honest, the use case is mostly convenience
when people train ~5000+ experiments, all saved in a few sub-folders with long strings as experiment names
before publishing a paper, for example, we want to copy a small number of successful trainings to a separate location and share them with other colleagues/management
I'd guess the alternative can be
changing the name of the successful training under the existing sub folder
using move instead of clone
anything else?
VivaciousPenguin66 your docs were helpful, I got SSL running but my question remains:
have you kept the needed http services accessible and only run the authentication via https?
api_server: "http://<my-clearml-server>:8008"
web_server: ""
files_server: "http://<my-clearml-server>:8081"
my current state is that the webserver is accessible via both http and https, on 8080 & 443
when running in debug and watching the values I get
data = response.json()
projects = data['data']['projects']
all_data.extend(projects)
in each loop iteration projects holds the same 500 values, and all_data gets appended with the same 500 values in an endless loop
I have a bug in my code and can't find it just yet
in case this helps someone else, I did not have root access to the training machine to add the cert to the store
you can point your python to your own CA using:
export CURL_CA_BUNDLE=/path/to/CA.pem
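As a side note, `requests` resolves the bundle from the environment with `REQUESTS_CA_BUNDLE` taking precedence over `CURL_CA_BUNDLE`. The helper below just mirrors that lookup for illustration; it is not requests' actual code:

```python
import os

def ca_bundle_from_env():
    """Return the CA bundle path the way requests resolves it:
    REQUESTS_CA_BUNDLE takes precedence, then CURL_CA_BUNDLE."""
    return os.environ.get("REQUESTS_CA_BUNDLE") or os.environ.get("CURL_CA_BUNDLE")
```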
trying to use projects.get_all
to pull all my projects into a single file
and there are more than 500 ...
I think this is the right approach, let me have a deeper look
thanks @<1724235687256920064:profile|LonelyFly9>
I'm looking at iptables configuration that was done by other teams
trying to find which rule blocks clearml
(all worked when iptables disabled)
I know the 500 limit and am using it
but my while loop keeps pulling the same 500 ... and running endlessly
OK I got everything to work
I think this script can be useful to other people and will be happy to share
@<1523701070390366208:profile|CostlyOstrich36> is there some repo I fork and contribute?
let me dig in more and hopefully can share successful results
thanks!
@<1523701087100473344:profile|SuccessfulKoala55> looks OK (?)
>>> StorageHelper.get(Task._get_default_session().get_files_server_host())._container.session.verify
InsecureRequestWarning: Certificate verification is disabled! Adding certificate verification is strongly advised. See:
True
@<1523701435869433856:profile|SmugDolphin23> working! here is what I have on Fedora/RHEL
- copy certs to /etc/pki/ca-trust/source/anchors/
- run update-ca-trust
so I have large json, with list of task id's
which I want to delete in bulk
API is doable
how about the SDK? how do I provide a list of task id's for deletion?
from the cleanup example:
for task in tasks:
    try:
        deleted_task = Task.get_task(task_id=task.id)
        deleted_task.delete(
how do I set tasks, while coming from a known list of task id's?
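A minimal sketch of bulk deletion driven by an id list. The `delete_one` callable is injected so failures are collected instead of aborting the whole run; the ClearML-specific call is shown only in the usage note and assumes the `Task.get_task(task_id=...)` / `.delete()` pattern from the cleanup example:

```python
def delete_tasks(task_ids, delete_one):
    """Apply delete_one to every id; return the ids that failed."""
    failed = []
    for tid in task_ids:
        try:
            delete_one(tid)
        except Exception:
            failed.append(tid)
    return failed
```

With ClearML this would be roughly `delete_tasks(ids, lambda tid: Task.get_task(task_id=tid).delete())`, where `ids` is the list loaded from your large json (e.g. `json.load(open("tasks.json"))`, adjusted to however the file is laid out).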
not sure it's the same use case but I will begin asking around
if you have any other hint on how to query mongo and look for a potential culprit - I'd be glad to hear it