
Reputation
Badges 1
75 × Eureka!looks like the same as in server_info
yeah, even then it'll run but return 0 notebooks
curious whether it impacts anything besides sagemaker. I'm thinking it's generically a kernel gateway issue, but I'm not sure if other platforms are using that yet
and the only calls to "uri": "/api/sessions"
are the ones I made during testing - sagemaker doesn't seem to ever call that itself
I've poked around both the internal URL that Jupyter kernel is running on and some of the files in /sagemaker/.jupyter
but no luck so far - I can find plenty of kernel info, but not session
so notebook path is empty
sh-4.2$ cat /var/log/studio/kernel_gateway.log | head -n10
{"__timestamp__": "2023-02-23T21:48:28.036559Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0012829303741455078, "method": "GET", "uri": "/api", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.111068Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.00128793...
if I use the same kernel there'll be two
environ{'PYTHONNOUSERSITE': '0',
'HOSTNAME': 'gfp-science-ml-t3-medium-d579233e8c4b53bc5ad626f2b385',
'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/_sagemaker-instance-credentials/xxx',
'JUPYTER_PATH': '/usr/share/jupyter/',
'SAGEMAKER_LOG_FILE': '/var/log/studio/kernel_gateway.log',
'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tmp/miniconda3/condabin:/tmp/anaconda3/condabin:/tmp/miniconda2/condabin:/tmp/anaconda2/condabin'...
right now I can't figure out how to get the session in order to get the notebook path
As another test I ran Jupyter Lab locally using the same custom Docker container that we're using for Sagemaker Studio, and it works great there, just like the native local Jupyter Lab. So it's seemingly not the image, but maybe something to do with how Studio runs it as a kernel.
as best I can tell it'll only have one .ipynb in $HOME
with this setup, which may work...
the CLEARML_*
variables are all explained here: None
and $QUEUE and $NUM_WORKERS are particular to my setup, but they just give the name of the queue and how many copies of the agent to run
I'm doing that and it's working well
the key point is you just loop through the number of workers, set a unique CLEARML_WORKER_ID for each, and then run it in the background
here's my script:
#!/bin/bash
echo "******************** Starting Agent ********************"
echo "******************** Getting ENV Variables ********************"
source /etc/profile.d/env-vars.sh
# test that we can access the API
echo "******************** Waiting for ${CLEARML_API_HOST} connectivity ********************"
curl --retry 10 --retry-delay 10 --retry-connrefused ${CLEARML_API_HOST}/debug.ping
# start the agent
for i in $(seq 1 ${NUM_WORKERS})
do
export CLEARML_WORK...
so notebooks
ends up empty
it does return kernels, just not sessions
the problem is here: None
I can get it to run up to here: None
weird that it won't return that single session
lots of things like {"__timestamp__": "2023-02-23T23:49:23.285946Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007679462432861328, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
and cat /var/log/studio/kernel_gateway.log | grep ipynb
comes up empty