It seems something is wrong with the server itself...
We should probably change it so it is more human readable 🙂
Where is it running? Could you restart all the docker containers? Is it running on your machine?
I wanted to know what the best way to create and register the SSL keys is.
Oh I see, so basically you need to add nginx with SSL certificates on top of the hosted service (or configure the docker-compose nginx container to add that)
Then you need to add the self-signed SSL certificate to any host machine (I'm assuming these are not "valid" SSL certificates generated by a reputable SSL provider)
But generally speaking, if you are using a self-hosted clearml-server on a local machine that n...
Hi @<1552101458927685632:profile|FreshGoldfish34>
self-hosted, you mean the open source version? If so, then yes, totally free 🙂
That said, I would recommend keeping the server inside your VPN, just in case, from a security perspective
How do I best utilize clearml in this scenario such that any coworker of mine is able to reproduce my work with the same pipeline?
Basically this sounds to me like proper software development design (i.e. the class vs. stages).
To make sure anyone can reproduce it: do you mean anyone can rerun the "pipeline"? If that is the case, just add Task.init (maybe use a specific Task type) and the agents will make sure it is fully reproducible.
If you mean the data itself is stored, the...
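In any case, a minimal sketch of that Task.init call (the project/task names and the task type below are placeholders, not anything from your setup):
```
from clearml import Task

# Register this run so an agent can fully reproduce it later;
# the names here are placeholders
task = Task.init(
    project_name="my_project",
    task_name="data_processing_stage",
    task_type=Task.TaskTypes.data_processing,  # a specific Task type, as suggested above
)
```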
Hi PanickyMoth78
it was uploading fine for most of the day but now it is not uploading metrics and at the end
Where are you uploading metrics to (i.e. where is the clearml-server) ?
Are you seeing any retry logging on your console?
`packages/clearml/backend_interface/metrics/reporter.py", line 124, in wait_for_events`
This seems to be consistent with waiting for metrics to be flushed to the backend, but usually you will see retry messages on your console when that happens
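If the process exits before the background reporter drains its queue, forcing a flush before shutdown usually helps; a small sketch using the SDK's Task.flush:
```
from clearml import Task

# Grab the currently running task (returns None if no task was initialized)
task = Task.current_task()
# Block until all queued metrics/artifacts are sent before the process exits
task.flush(wait_for_uploads=True)
```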
Hmm, I still wonder what the "correct" answer is for most people; is an empty string in argparse redundant anyway? Will anyone ever use it?
DistressedGoat23
you can now access the weights model object:
```
pip install clearml==1.8.1rc0
```
then:
```
from clearml.binding.frameworks import WeightsFileHandler

def callback(_, model_info):
    model_info.weights_object  # this is your xgboost object
    model_info.name = "my new name"
    return model_info

WeightsFileHandler.add_pre_callback(callback)
```
Welp, it's been a day with the new settings, and stats went up 140K for API calls
... going to check again tomorrow to see if any of that was spill over from yesterday
140K calls a day; how often are you sending scalars? How long is it running? How many experiments are running?
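One easy way to cut the API call volume is to throttle how often you report scalars; a rough sketch (the every-50-iterations value is arbitrary, and the random loss is just a stand-in for a real training step):
```
import random
from clearml import Task

task = Task.init(project_name="my_project", task_name="throttled_reporting")
logger = task.get_logger()

REPORT_EVERY = 50  # arbitrary: report once per 50 iterations instead of every one

for i in range(1000):
    loss = random.random()  # stand-in for a real training step
    if i % REPORT_EVERY == 0:
        logger.report_scalar(title="train", series="loss", value=loss, iteration=i)
```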
I want pipeline / task dispatch to be reported and monitored outside of clearml. For example, I might want to log the dispatch event in some non-clearml system and then monitor the health of the pipeline and alert if it is pending for too long.
Hmm interesting, so like a callback?!
I'm thinking a callback that is executed after the pipeline is sent, but once the callback is done, the pipeline process exits?
Does that make sense?
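If that is the direction, something like the step callbacks on PipelineController could serve as the hook; a rough sketch, where notify_external is a placeholder for whatever non-clearml system you log to:
```
from clearml import PipelineController

def notify_external(event, node_name):
    # placeholder: push the dispatch event to your own monitoring system
    print(f"{event}: {node_name}")

def pre_step(pipeline, node, param_override):
    notify_external("dispatched", node.name)
    return True  # returning False would skip the step

def post_step(pipeline, node):
    notify_external("completed", node.name)

pipe = PipelineController(name="monitored_pipeline", project="my_project", version="1.0.0")
pipe.add_step(
    name="stage_one",
    base_task_project="my_project",      # placeholder: project of the base task
    base_task_name="stage_one_task",     # placeholder: the task to clone for this step
    pre_execute_callback=pre_step,
    post_execute_callback=post_step,
)
pipe.start()
```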
I might want to dispatch other jobs from within the same p...
In theory this would be doable, but wouldn't it be a bit confusing? Also, why not always use containers if the host supports it? There is no real downside; just set the default docker image to something that is a good starting point.
JitteryCoyote63 fix should be pushed later today 🙂
Meanwhile you can manually add the Task.init() call at the top of the original script; it is basically the same 🙂
You should have a download button when you hover over the table; I guess that would be the easiest.
If needed I can send SDK code, but unfortunately there is no single call for that
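For the record, a rough sketch of the SDK route, assuming your clearml version exposes Task.get_reported_plots (the 'metric'/'variant'/'plot_str' keys below are my assumption about the event payload):
```
import json
from clearml import Task

task = Task.get_task(task_id="your_task_id")  # placeholder id
# Each entry should be a reported plot/table serialized as a plotly JSON payload
for plot in task.get_reported_plots():
    payload = json.loads(plot["plot_str"])  # assumed key for the raw plotly JSON
    print(plot["metric"], plot["variant"], payload.get("data"))
```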
PleasantGiraffe85
it took the repo from the cache. When I delete the cache, it can't get the repo any longer.
What error are you getting? (Are we talking about the internal repo?)
Hi SkinnyPanda43
Can you attach the full log?
clearml-agent is installed before your requirements.txt, so at least in theory it should not collide
Hi ElegantCoyote26
is there a way to get a Task's docker container id/name?
you mean like Task.get_task("task_id_here").get_base_docker()
?
Now a Task's results page also has a plot for this, but I guess it's at the machine level and not the task level?
This is actually on the container level, meaning checked from inside the container. It should be what you are looking for
I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.
UnevenDolphin73 you mean so as to get the Task object from it?
(This might be doable, the main issue would be the metrics / logs loading)
What would be the use case for the testing ?
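For reference, a sketch of the offline round-trip that exists today; note that the import step uploads the session to the server, which is exactly the gap discussed above (paths and names are placeholders):
```
from clearml import Task

# Producer side: run entirely offline, everything is written to a local folder
Task.set_offline(offline_mode=True)
task = Task.init(project_name="my_project", task_name="offline_run")
# ... run the experiment ...
task.close()

# Consumer side: currently the only way to "load" the session is to import it,
# which uploads it to the clearml-server
new_task_id = Task.import_offline_session(session_folder_zip="/path/to/offline/session.zip")
```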
CheerfulGorilla72
yes, IP-based access,
hmm, so this is the main downside of using an IP-based server: the links (debug images, models, artifacts) store the full URL (e.g. http://IP:8081/...). This means if you switched the IP they will no longer work. Any chance to set the new server to the old IP?
(the other option is somehow edit the DB with the links, I guess doable but quite risky)
ImmensePenguin78 this is probably for a different python version ...
Both are fully implemented in the enterprise version. I remember a few medical use cases, and I think they are working on publishing a blog post on it, not sure. Anyhow I suggest you contact the sales people and I'm sure they will gladly setup a call/demo/PoC.
https://allegro.ai/enterprise/#contact
Are you saying that odd script entry point was created by calling Task.init? (Just to clarify that this is the problem.)
Btw after you clone the experiment you can always manually edit both entry point and working dir, which based on what you said should be "script.py" and "folder"
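Programmatically, the same edit would look roughly like this, assuming your SDK version exposes Task.set_script (the task id and names are placeholders):
```
from clearml import Task

# Clone the original experiment, then point the copy at the real entry point
cloned = Task.clone(source_task="original_task_id", name="fixed entry point")
cloned.set_script(working_dir="folder", entry_point="script.py")
```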
Hi AstonishingRabbit13
is there option to omit the task_id so the final output will be deterministic and know prior to the task run?
Actually no 😞 the full path is unique for the run, so you do not end up overwriting models.
You can get the full path from the UI (Models tab) or programmatically with Model.query_models or the Task.get_task methods.
What's the idea behind a fixed location for the model?
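A rough sketch of the programmatic route (project/model names and the task id are placeholders):
```
from clearml import Model, Task

# Option 1: query the model registry directly
models = Model.query_models(project_name="my_project", model_name="my_model")
for m in models:
    print(m.id, m.url)  # m.url is the full (unique-per-run) path

# Option 2: go through the task that produced the model
task = Task.get_task(task_id="your_task_id")
for m in task.get_models()["output"]:
    print(m.name, m.url)
```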
If you have to choose between skipping the value or logging it as NaN, I find it difficult to say; it seems better to log than to skip, but it needs some thought.
So I "think" the issue is that plotly (the UI) doesn't like NaN, and elastic (storing the scalar) is not a NaN fan either. We need to check whether they both agree on the representation; then it should be easy to fix...
Maybe you could open a github issue, so we do not forget?
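Until that is sorted, a small guard on the reporting side is an easy workaround; a sketch where repeating the last valid value is just one possible policy:
```
import math
from clearml import Task

task = Task.init(project_name="my_project", task_name="nan_safe_reporting")
logger = task.get_logger()

last_valid = 0.0

def report_safe(title, series, value, iteration):
    global last_valid
    if value is None or not math.isfinite(value):
        value = last_valid  # one possible policy: repeat the last valid value
    else:
        last_valid = value
    logger.report_scalar(title=title, series=series, value=value, iteration=iteration)
```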