I've poked around both the internal URL that Jupyter kernel is running on and some of the files in /sagemaker/.jupyter
but no luck so far - I can find plenty of kernel info, but not session
Hi LittleReindeer37
Yes you are correct it should capture the entire jupyter notebook in sagemaker studio.
Just verifying this is the use case, correct ?
At the top there should be the URL of the notebook (I think)
but the only exception handler is for requests.exceptions.SSLError
Hmm and you are getting empty list for thi one:
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
environ{'PYTHONNOUSERSITE': '0',
'HOSTNAME': 'gfp-science-ml-t3-medium-d579233e8c4b53bc5ad626f2b385',
'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/_sagemaker-instance-credentials/xxx',
'JUPYTER_PATH': '/usr/share/jupyter/',
'SAGEMAKER_LOG_FILE': '/var/log/studio/kernel_gateway.log',
'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tmp/miniconda3/condabin:/tmp/anaconda3/condabin:/tmp/miniconda2/condabin:/tmp/anaconda2/condabin',
'REGION_NAME': 'us-east-1',
'AWS_INTERNAL_IMAGE_OWNER': 'Custom',
'AWS_DEFAULT_REGION': 'us-east-1',
'PWD': '/home/sagemaker-user',
'AWS_REGION': 'us-east-1',
'SHLVL': '1',
'HOME': '/home/sagemaker-user',
'AWS_SAGEMAKER_PYTHONNOUSERSITE': '0',
'AWS_ACCOUNT_ID': 'xxx',
'_': '/opt/.sagemakerinternal/conda/bin/jupyter-kernelgateway',
'LC_CTYPE': 'C.UTF-8',
'KERNEL_LAUNCH_TIMEOUT': '40',
'KERNEL_WORKING_PATH': '',
'KERNEL_GATEWAY': '1',
'JPY_PARENT_PID': '9',
'PYDEVD_USE_FRAME_EVAL': 'NO',
'TERM': 'xterm-color',
'CLICOLOR': '1',
'FORCE_COLOR': '1',
'CLICOLOR_FORCE': '1',
'PAGER': 'cat',
'GIT_PAGER': 'cat',
'MPLBACKEND': '
_inline'}
nice! Just tested it on my end as well, looks like it works!
lots of things like {"__timestamp__": "2023-02-23T23:49:23.285946Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007679462432861328, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
What happens when you call:
from clearml.backend_interface.task.repo import ScriptInfo
print(ScriptInfo._ScriptInfo__legacy_jupyter_notebook_server_json_parsing(None))
but the call to jupyter_server.serverapp.list_running_servers()
does return the server
As another test I ran Jupyter Lab locally using the same custom Docker container that we're using for Sagemaker Studio, and it works great there, just like the native local Jupyter Lab. So it's seemingly not the image, but maybe something to do with how Studio runs it as a kernel.
This is strange, let me see if we can get around it, because I'm sure it worked 🙂
I additionally tried using a Sagemaker Notebook instance, to see if it was the kernel dockerization that Studio uses that was messing things up. But it seems to actually log less information from a Notebook instance vs Studio .
if there are any tests/debugging you'd like me to try, just let me know
sh-4.2$ cat /var/log/studio/kernel_gateway.log | head -n10
{"__timestamp__": "2023-02-23T21:48:28.036559Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0012829303741455078, "method": "GET", "uri": "/api", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.111068Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0012879371643066406, "method": "GET", "uri": "/api/kernels", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.116324Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007715225219726562, "method": "GET", "uri": "/api/terminals", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.272822Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007491111755371094, "method": "GET", "uri": "/api/terminals", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.000795Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 2.539133071899414, "method": "POST", "uri": "/api/kernels", "status": 201}
{"__timestamp__": "2023-02-23T21:48:43.073568Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0013430118560791016, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.469751Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0013761520385742188, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.702549Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0013780593872070312, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.986808Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007445812225341797, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.992860Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.001028299331665039, "method": "GET", "uri": "/api/kernels", "status": 200}
and cat /var/log/studio/kernel_gateway.log | grep ipynb
comes up empty
. I'm thinking it's generically a kernel gateway issue, but I'm not sure if other platforms are using that yet
The odd thing is that you can access the notebook, but it returns zero kernels ..
I think it just ends up in /home/sagemaker-user/{notebook}.ipynb
every time
if I instead change the request url to f"http://{server_info['hostname']}:{server_info['port']}/api/sessions"
then it gets a 200 response... however , the response is an empty list
the server_info
is
[{'base_url': '/jupyter/default/',
'hostname': '0.0.0.0',
'password': False,
'pid': 9,
'port': 8888,
'root_dir': '/home/sagemaker-user',
'secure': False,
'sock': '',
'token': '',
'url': '
',
'version': '1.23.2'}]
at least in 2018 it returned sessions! None
it does return kernels, just not sessions
So it's seemingly not the image, but maybe something to do with how Studio runs it as a kernel.
Yeah I think that for some reason it fails detecting this is actually jupyter noteboko (not really sure why), Thank you for double checking on the container !!