Hi @<1532532498972545024:profile|LittleReindeer37>
Yes you are correct it should capture the entire jupyter notebook in sagemaker studio.
Just verifying this is the use case, correct ?
Yes, I'm running a notebook in Studio. Where should it be captured?
As in, which tab when I'm viewing the Experiment should I see it on? Should it be code, an artifact, or something else?
Just ran the same notebook in a local Jupyter Lab session and it worked as I expected it might, saving a copy to Artifacts
I additionally tried using a Sagemaker Notebook instance, to see if it was the kernel dockerization that Studio uses that was messing things up. But it seems to actually log less information from a Notebook instance vs Studio .
Yep I think you are correct, you should have had the same output as a local jupyter notebook, and it seems that in sagemaker studio it is not working 😞
Let me check something
if there are any tests/debugging you'd like me to try, just let me know
As another test I ran Jupyter Lab locally using the same custom Docker container that we're using for Sagemaker Studio, and it works great there, just like the native local Jupyter Lab. So it's seemingly not the image, but maybe something to do with how Studio runs it as a kernel.
So it's seemingly not the image, but maybe something to do with how Studio runs it as a kernel.
Yeah I think that for some reason it fails detecting this is actually jupyter noteboko (not really sure why), Thank you for double checking on the container !!
poking around a little bit, and clearml.backend_interface.task.repo.scriptinfo.ScriptInfo._get_jupyter_notebook_filename()
returns None
but the call to jupyter_server.serverapp.list_running_servers()
does return the server
the server_info
is
[{'base_url': '/jupyter/default/',
'hostname': '0.0.0.0',
'password': False,
'pid': 9,
'port': 8888,
'root_dir': '/home/sagemaker-user',
'secure': False,
'sock': '',
'token': '',
'url': '
',
'version': '1.23.2'}]
and that requests.get()
throws an exception:
ConnectionError: HTTPConnectionPool(host='default', port=8888): Max retries exceeded with url: /jupyter/default/api/sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7ba9cadc30>: Failed to establish a new connection: [Errno -2] Name or service not known'))
but the only exception handler is for requests.exceptions.SSLError
if I instead change the request url to f"http://{server_info['hostname']}:{server_info['port']}/api/sessions"
then it gets a 200 response... however , the response is an empty list
api/kernels
does report back the active kernel, but doesn't give notebook paths or anything
@<1532532498972545024:profile|LittleReindeer37> nice!!! 😍
Do you want to PR? it will be relatively easy to merge and test, and I think that they might even push it to the next version (or worst case quick RC)
right now I can't figure out how to get the session in order to get the notebook path
seems like it's using None and that doesn't provide the normal api/sessions
endpoint - or, it does, but returns an empty list
right now I can't figure out how to get the session in order to get the notebook path
you mean the code that fires "HTTPConnectionPool" ?
I've poked around both the internal URL that Jupyter kernel is running on and some of the files in /sagemaker/.jupyter
but no luck so far - I can find plenty of kernel info, but not session
What do you have in "server_info['url']" ?