yeah, even then it'll run but return 0 notebooks
curious whether it impacts anything besides sagemaker. I'm thinking it's generically a kernel gateway issue, but I'm not sure if other platforms are using that yet
Try to add here:
None
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
and the only calls to "uri": "/api/sessions"
are the ones I made during testing - sagemaker doesn't seem to ever call that itself
I've poked around both the internal URL that Jupyter kernel is running on and some of the files in /sagemaker/.jupyter
but no luck so far - I can find plenty of kernel info, but not session
Hi @<1532532498972545024:profile|LittleReindeer37>
Yes you are correct it should capture the entire jupyter notebook in sagemaker studio.
Just verifying this is the use case, correct ?
sh-4.2$ cat /var/log/studio/kernel_gateway.log | head -n10
{"__timestamp__": "2023-02-23T21:48:28.036559Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0012829303741455078, "method": "GET", "uri": "/api", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.111068Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0012879371643066406, "method": "GET", "uri": "/api/kernels", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.116324Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007715225219726562, "method": "GET", "uri": "/api/terminals", "status": 200}
{"__timestamp__": "2023-02-23T21:48:39.272822Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007491111755371094, "method": "GET", "uri": "/api/terminals", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.000795Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 2.539133071899414, "method": "POST", "uri": "/api/kernels", "status": 201}
{"__timestamp__": "2023-02-23T21:48:43.073568Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0013430118560791016, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.469751Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0013761520385742188, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.702549Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0013780593872070312, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.986808Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.0007445812225341797, "method": "GET", "uri": "/api/kernels/6ba227af-ff2c-4b20-89ac-86dcac95e2b2", "status": 200}
{"__timestamp__": "2023-02-23T21:48:43.992860Z", "__schema__": "sagemaker.kg.request.schema", "__schema_version__": 1, "__metadata_version__": 1, "account_id": "", "duration": 0.001028299331665039, "method": "GET", "uri": "/api/kernels", "status": 200}
if I use the same kernel there'll be two
Hi @<1532532498972545024:profile|LittleReindeer37> @<1523701205467926528:profile|AgitatedDove14>
I got the session with a bit of "hacking".
See this script:
import boto3, requests, json
from urllib.parse import urlparse
def get_notebook_data():
log_path = "/opt/ml/metadata/resource-metadata.json"
with open(log_path, "r") as logs:
_logs = json.load(logs)
return _logs
notebook_data = get_notebook_data()
client = boto3.client("sagemaker")
response = client.create_presigned_domain_url(
DomainId=notebook_data["DomainId"],
UserProfileName=notebook_data["UserProfileName"]
)
authorized_url = response["AuthorizedUrl"]
authorized_url_parsed = urlparse(authorized_url)
unauthorized_url = authorized_url_parsed.scheme + "://" + authorized_url_parsed.netloc
with requests.Session() as s:
s.get(authorized_url)
print(s.get(unauthorized_url + "/jupyter/default/api/sessions").content)
Basically, we can get the session directly from AWS, but we need to be authenticated.
One way I found was to create a presigned url through boto3, by getting the domain id and profile name from a resoure-metadata file that is found on the machine None .
Then use that to get the session...
Maybe there are some other ways to do this (safer), but this is a good start. We know it's possible
environ{'PYTHONNOUSERSITE': '0',
'HOSTNAME': 'gfp-science-ml-t3-medium-d579233e8c4b53bc5ad626f2b385',
'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/_sagemaker-instance-credentials/xxx',
'JUPYTER_PATH': '/usr/share/jupyter/',
'SAGEMAKER_LOG_FILE': '/var/log/studio/kernel_gateway.log',
'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tmp/miniconda3/condabin:/tmp/anaconda3/condabin:/tmp/miniconda2/condabin:/tmp/anaconda2/condabin',
'REGION_NAME': 'us-east-1',
'AWS_INTERNAL_IMAGE_OWNER': 'Custom',
'AWS_DEFAULT_REGION': 'us-east-1',
'PWD': '/home/sagemaker-user',
'AWS_REGION': 'us-east-1',
'SHLVL': '1',
'HOME': '/home/sagemaker-user',
'AWS_SAGEMAKER_PYTHONNOUSERSITE': '0',
'AWS_ACCOUNT_ID': 'xxx',
'_': '/opt/.sagemakerinternal/conda/bin/jupyter-kernelgateway',
'LC_CTYPE': 'C.UTF-8',
'KERNEL_LAUNCH_TIMEOUT': '40',
'KERNEL_WORKING_PATH': '',
'KERNEL_GATEWAY': '1',
'JPY_PARENT_PID': '9',
'PYDEVD_USE_FRAME_EVAL': 'NO',
'TERM': 'xterm-color',
'CLICOLOR': '1',
'FORCE_COLOR': '1',
'CLICOLOR_FORCE': '1',
'PAGER': 'cat',
'GIT_PAGER': 'cat',
'MPLBACKEND': '
_inline'}
right now I can't figure out how to get the session in order to get the notebook path
As another test I ran Jupyter Lab locally using the same custom Docker container that we're using for Sagemaker Studio, and it works great there, just like the native local Jupyter Lab. So it's seemingly not the image, but maybe something to do with how Studio runs it as a kernel.
as best I can tell it'll only have one .ipynb in $HOME
with this setup, which may work...
and this
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/{server_info['base_url']}/"
At the top there should be the URL of the notebook (I think)
it does return kernels, just not sessions
Hmm and you are getting empty list for thi one:
server_info['url'] = f"http://{server_info['hostname']}:{server_info['port']}/"
weird that it won't return that single session