$QUEUE and $NUM_WORKERS are particular to my setup, but they just give the name of the queue and how many copies of the agent to run
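to make that concrete, here's roughly what my /etc/profile.d/env-vars.sh looks like - the values below are placeholders, only the variable names match my actual file:

export CLEARML_API_HOST="http://clearml-server:8008"  # ClearML API server the agents talk to
export QUEUE="default"                                # queue the agents pull tasks from
export NUM_WORKERS="2"                                # how many agent copies to run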
which I looked at previously to see if I could import sagemaker.kg or kernelgateway or something, but no luck
as best I can tell it'll only have one .ipynb in $HOME with this setup, which may work...
so notebook path is empty
weird that it won't return that single session
so the notebooks list ends up empty
but the only exception handler is for requests.exceptions.SSLError
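if you want to see what the SDK sees, you can curl the Jupyter sessions API yourself - the host and port here are guesses on my part, Studio may run the server elsewhere:

sh-4.2$ curl -s http://127.0.0.1:8888/api/sessions
[]

that empty list is what I'm getting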
I'm doing that and it's working well
but maybe that doesn't matter, actually - it might be one session per host I guess
here's my script:
#!/bin/bash
echo "******************** Starting Agent ********************"
echo "******************** Getting ENV Variables ********************"
source /etc/profile.d/env-vars.sh
# test that we can access the API
echo "******************** Waiting for ${CLEARML_API_HOST} connectivity ********************"
curl --retry 10 --retry-delay 10 --retry-connrefused ${CLEARML_API_HOST}/debug.ping
# start the agent
for i in $(seq 1 ${NUM_WORKERS})
do
export CLEARML_WORK...
Yes, I'm running a notebook in Studio. Where should it be captured?
And then we want to compare backtests or just this week's estimates across multiple of those models/branches
the key point is you just loop through the number of workers, set a unique CLEARML_WORKER_ID for each, and then run it in the background
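so the tail of the loop (which got cut off in my paste above) looks something like this - the worker ID format is just my convention, and clearml-agent daemon is the standard way to start an agent against a queue:

for i in $(seq 1 ${NUM_WORKERS})
do
    # each copy needs its own worker ID or they'll collide
    export CLEARML_WORKER_ID="${HOSTNAME}:worker-${i}"
    clearml-agent daemon --queue "${QUEUE}" &
done
# keep the container alive while the agents run
wait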
sh-4.2$ cat /var/log/studio/kernel_gateway.log | head -n10
{"__timestamp__": "2023-02-23T21:48:28.036559Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0012829303741455078, "method": "GET", "uri": "/api", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.111068Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.00128793...
but even then the sessions endpoint is still empty
those look like linear DAGs to me, but maybe I'm missing something. I'm thinking of something like the map operator in Prefect, where I can provide an array like ["A", "B", "C"] and run the steps outlined with dotted lines independently, with each of those as arguments
if I add the base_url it's not found
Just ran the same notebook in a local Jupyter Lab session and it worked as I expected, saving a copy to Artifacts
if I change it to 0.0.0.0 it works
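in curl terms, what I'm seeing is roughly this (the port and the Studio base_url prefix are from my setup, so treat the exact URLs as guesses):

sh-4.2$ curl -s http://0.0.0.0:8888/jupyter/default/api/sessions   # with the base_url prefix: not found
sh-4.2$ curl -s http://0.0.0.0:8888/api/sessions                   # without it, host set to 0.0.0.0: responds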
the CLEARML_* variables are all explained in the clearml-agent documentation